date:20101102

[R] thanks alot

2010-11-02 Thread cbbaniya

Hi,
Many, many thanks for such an elaborate effort and help.
Chitra

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Drawing circles on a chart

2010-11-02 Thread Gabor Grothendieck

On Tue, Nov 2, 2010 at 10:58 PM, Santosh Srinivas
 wrote:
> Thanks Gabor. I used melt to transform the data and plot using balloonplot.
>
> tData <- structure(list(A = c(0.2, 0.13, 0.05, 0.1, 0.02, 0.18, 0.09, 0.06,
>  0.13), B = c(0.15, 0.06, 0.09, 0.02, 0.03, 0.12, 0.01, 0.15, 0.06), C
>  = c(-0.1, 0, -0.07, -0.06, -0.05, -0.05, -0.06, -0.08, -0.07), D =
>  c(-0.15, -0.05, -0.1, -0.03, -0.13, -0.04, -0.1, -0.04, -0.15), E =
>  c(-0.17, -0.16, -0.08, -0.07, -0.09, -0.14, -0.1, -0.05, 0)), .Names =
>  c("A", "B", "C", "D", "E"), class = "data.frame", row.names = c(NA,
>  -9L))
>
> tData$Period <- rownames(tData)
>
> tData.m <- melt(tData)
>
> # need to find a way to adjust the color for -ve values
> balloonplot(tData.m$Period,tData.m$variable,abs(tData.m$value))
>

You can also try fixing this up, where TS is the dput object:

mat <- as.matrix(TS)
plot(col(TS) ~ row(TS), cex = 5 * (mat - min(mat)) / diff(range(mat)),
col = 1 + (mat > 0))
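
As a rough sketch of the same idea with base graphics only, the snippet below draws one circle per cell with symbols(), sized by the absolute value and coloured green or red by sign, with opacity scaled by magnitude. The helper name plot_circles, the colour values and the inches scaling are assumptions made for illustration; the data are the tData values posted in this thread.

```r
# Data as posted in the thread (9 periods, series A to E).
tData <- data.frame(
  A = c(0.2, 0.13, 0.05, 0.1, 0.02, 0.18, 0.09, 0.06, 0.13),
  B = c(0.15, 0.06, 0.09, 0.02, 0.03, 0.12, 0.01, 0.15, 0.06),
  C = c(-0.1, 0, -0.07, -0.06, -0.05, -0.05, -0.06, -0.08, -0.07),
  D = c(-0.15, -0.05, -0.1, -0.03, -0.13, -0.04, -0.1, -0.04, -0.15),
  E = c(-0.17, -0.16, -0.08, -0.07, -0.09, -0.14, -0.1, -0.05, 0))

mat <- as.matrix(tData)
x <- as.vector(row(mat))   # period 1..9 on the x axis
y <- as.vector(col(mat))   # series A..E on the y axis
v <- as.vector(mat)

# Green for positive, red for negative; opacity scaled by |value|.
alpha <- abs(v) / max(abs(v))
cols <- ifelse(v >= 0, rgb(0, 0.6, 0, alpha), rgb(0.8, 0, 0, alpha))

# Hypothetical helper: empty plot first, then circles added on top.
plot_circles <- function() {
  plot(NA, type = "n", xlim = c(0.5, 9.5), ylim = c(5.5, 0.5),
       xlab = "Period", ylab = "", yaxt = "n")
  axis(2, at = seq_len(ncol(mat)), labels = colnames(mat), las = 1)
  symbols(x, y, circles = abs(v), inches = 0.25,
          bg = cols, fg = "grey40", add = TRUE)
}
```

Calling plot_circles() on any open graphics device draws the chart; the reversed ylim puts series A at the top.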



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

[R] Calculate the weighted mean of a matrix

2010-11-02 Thread sirus


Hello everybody. I am writing the FCM algorithm in R, but I am wondering
about the efficiency
when calculating the centers using the formula
http://r.789695.n4.nabble.com/file/n3024803/ComputeCenters.png

where each X_i is a vector and the U_ij are the entries of a matrix.
In other words, I want to multiply each vector in X by a weight, sum them
up, and calculate the mean.

Thank you in advance for your help 
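
The linked formula image is not reproduced here, so the following is only a sketch under the assumption that it is the usual fuzzy c-means center update, c_j = sum_i(u_ij^m x_i) / sum_i(u_ij^m). With the observations as rows of a matrix X and the memberships in a matrix U, all weighted sums can be done in one matrix product instead of a loop. The names fcm_centers, X, U and the fuzzifier m are illustrative assumptions.

```r
# X: n x p data matrix (one observation per row)
# U: n x k membership matrix (row i, column j holds u_ij)
# m: fuzzifier exponent (assumed; m = 1 gives a plain weighted mean)
fcm_centers <- function(X, U, m = 2) {
  W <- U^m
  # t(W) %*% X gives the k x p weighted sums; dividing by colSums(W)
  # recycles down the rows, so row j is divided by the total weight of cluster j.
  crossprod(W, X) / colSums(W)
}

# Tiny made-up example: 3 points, 2 clusters, hard memberships.
X <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)   # rows: (1,4), (2,5), (3,6)
U <- rbind(c(1, 0), c(1, 0), c(0, 1))
centers <- fcm_centers(X, U)
```

Here the first center is the mean of the first two rows and the second center is the third row.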
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Calculate-the-weithed-mean-of-a-matrix-tp3024803p3024803.html
Sent from the R help mailing list archive at Nabble.com.

Re: [R] memory allocation problem

2010-11-02 Thread Peter Langfelder

Oops,  I missed that you only have 4GB of memory... but since R is
apparently capable of using almost 10GB, either you actually have more
RAM, or the system is swapping some data to disk. Increasing memory
use in R might still help, but also may lead to a situation where the
system waits forever for data to be swapped to and from the disk.

Peter

On Tue, Nov 2, 2010 at 7:36 PM, Peter Langfelder
 wrote:
> You have (almost) exhausted the 10GB you limited R to (that's what the
> memory.size() tells you). Increase memory.limit (if you have more RAM,
> use memory.limit(15000) for 15GB etc), or remove large data objects
> from your session. Use rm(object), then issue garbage collection gc().
> Sometimes garbage collection may solve the problem on its own.
>
> Peter
>
>
> On Tue, Nov 2, 2010 at 5:55 PM, Lorenzo Cattarino  
> wrote:
>> I forgot to mention that I am using windows 7 (64-bit) and the R version
>> 2.11.1 (64-bit)
>>
>>
>

Re: [R] Drawing circles on a chart

2010-11-02 Thread Santosh Srinivas

Thanks Gabor. I used melt to transform the data and plot using balloonplot.

tData <- structure(list(A = c(0.2, 0.13, 0.05, 0.1, 0.02, 0.18, 0.09, 0.06, 
 0.13), B = c(0.15, 0.06, 0.09, 0.02, 0.03, 0.12, 0.01, 0.15, 0.06), C 
 = c(-0.1, 0, -0.07, -0.06, -0.05, -0.05, -0.06, -0.08, -0.07), D = 
 c(-0.15, -0.05, -0.1, -0.03, -0.13, -0.04, -0.1, -0.04, -0.15), E = 
 c(-0.17, -0.16, -0.08, -0.07, -0.09, -0.14, -0.1, -0.05, 0)), .Names = 
 c("A", "B", "C", "D", "E"), class = "data.frame", row.names = c(NA,
 -9L))

tData$Period <- rownames(tData)

tData.m <- melt(tData)

# need to find a way to adjust the color for -ve values
balloonplot(tData.m$Period,tData.m$variable,abs(tData.m$value))


-----Original Message-----
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: 03 November 2010 07:51
To: Santosh Srinivas
Cc: r-help@r-project.org
Subject: Re: [R] Drawing circles on a chart

On Tue, Nov 2, 2010 at 10:07 PM, Santosh Srinivas
 wrote:
> Dear Group,
> I have the following data matrix which is a timeseries.
>
>> dput(tData)
> structure(list(A = c(0.2, 0.13, 0.05, 0.1, 0.02, 0.18, 0.09,
> 0.06, 0.13), B = c(0.15, 0.06, 0.09, 0.02, 0.03, 0.12, 0.01,
> 0.15, 0.06), C = c(-0.1, 0, -0.07, -0.06, -0.05, -0.05, -0.06,
> -0.08, -0.07), D = c(-0.15, -0.05, -0.1, -0.03, -0.13, -0.04,
> -0.1, -0.04, -0.15), E = c(-0.17, -0.16, -0.08, -0.07, -0.09,
> -0.14, -0.1, -0.05, 0)), .Names = c("A", "B", "C", "D", "E"), class =
> "data.frame", row.names = c(NA,
> -9L))
>
>
> I am trying to display this data in a graphic. The values vary from -0.2 to
> +0.2
> There should be a table with 5 Rows and 9 Columns. Rows labeled A to E and
> Columns labeled 1 to 9.
> Inside each cell there should be a circle (sphere preferable) with radius of
> mod(data value). The color should be either red or green depending on -ve or
> +ve and the intensity should be based on the value of the datapoint.
>

See balloonplot in the gplots package.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

Re: [R] memory allocation problem

2010-11-02 Thread David Winsemius

Restart your computer. (Yeah, I know that's what the help desk always
says.)

Start R before doing anything else.

Then run your code in a clean session. Check ls() after startup to
make sure you don't have a bunch of useless stuff in your .RData
file. Don't load anything that is not germane to this problem. Use
this function to see what sort of space issues you might have after
loading objects:


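 getsizes <- function(n = 10) {
   # sizes (bytes) of all objects in the global environment
   z <- sapply(ls(envir = globalenv()),
               function(x) object.size(get(x, envir = globalenv())))
   as.matrix(head(sort(z, decreasing = TRUE), n))  # n largest, biggest first
 }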

Then run your code.

--
David.

On Nov 2, 2010, at 10:13 PM, Lorenzo Cattarino wrote:


> I would also like to include details on my R version
>
> > version
>                _
> platform       x86_64-pc-mingw32
> arch           x86_64
> os             mingw32
> system         x86_64, mingw32
> status
> major          2
> minor          11.1
> year           2010
> month          05
> day            31
> svn rev        52157
> language       R
> version.string R version 2.11.1 (2010-05-31)
>
> from FAQ 2.9
> (http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021)
> it says that:
> "For a 64-bit build, the default is the amount of RAM"
>
> So in my case the amount of RAM would be 4 GB. R should be able to
> allocate a vector of size 5 Mb without me typing any command (either as
> memory.limit() or appended string in the target path), is that right?
>
> From: Lorenzo Cattarino
> Sent: Wednesday, 3 November 2010 10:55 AM
> To: 'r-help@r-project.org'
> Subject: memory allocation problem
>
> I forgot to mention that I am using windows 7 (64-bit) and the R version
> 2.11.1 (64-bit)
>
> From: Lorenzo Cattarino
>
> I am trying to run a non linear parameter optimization using the
> function optim() and I have problems regarding memory allocation.
>
> My data are in a dataframe with 9 columns. There are 656100 rows.
>
> > head(org_results)
>   comb.id   p H1 H2 Range Rep no.steps  dist aver.hab.amount
> 1   1   0.1  0  0 11000 0.2528321  0.1393901
> 2   1   0.1  0  0 11000 0.4605934  0.1011841
> 3   1   0.1  0  0 11004 3.4273670  0.1052789
> 4   1   0.1  0  0 11004 2.8766364  0.1022138
> 5   1   0.1  0  0 11000 0.3496872  0.1041056
> 6   1   0.1  0  0 11000 0.1050840  0.3572036
>
> > est_coeff <- optim(coeff,SS, steps=org_results$no.steps,
> Range=org_results$Range, H1=org_results$H1, H2=org_results$H2,
> p=org_results$p)
> Error: cannot allocate vector of size 5.0 Mb
> In addition: Warning messages:
> 1: In optim(coeff, SS, steps = org_results$no.steps, Range =
> org_results$Range,  : Reached total allocation of 1Mb: see
> help(memory.size)
> 2: In optim(coeff, SS, steps = org_results$no.steps, Range =
> org_results$Range,  : Reached total allocation of 1Mb: see
> help(memory.size)
> 3: In optim(coeff, SS, steps = org_results$no.steps, Range =
> org_results$Range,  : Reached total allocation of 1Mb: see
> help(memory.size)
> 4: In optim(coeff, SS, steps = org_results$no.steps, Range =
> org_results$Range,  : Reached total allocation of 1Mb: see
> help(memory.size)
>
> > memory.size()
> [1] 9978.19
> > memory.limit()
> [1] 1
>
> I know that I am not sending reproducible codes but I was hoping that
> you could help me understand what is going on. I set a maximum limit of
> 1 mega byte (by writing this string --max-mem-size=1M after the
> target path, right click on R icon, shortcut tab). And R is telling me
> that it cannot allocate a vector of size 5 Mb???




David Winsemius, MD
West Hartford, CT

Re: [R] memory allocation problem

2010-11-02 Thread Peter Langfelder

You have (almost) exhausted the 10GB you limited R to (that's what the
memory.size() tells you). Increase memory.limit (if you have more RAM,
use memory.limit(15000) for 15GB etc), or remove large data objects
from your session. Use rm(object), then issue garbage collection gc().
Sometimes garbage collection may solve the problem on its own.

Peter
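
A minimal sketch of the rm() followed by gc() step described above, using a made-up object called big as a stand-in for data that is no longer needed. It deliberately avoids memory.size() and memory.limit(), which are Windows-only and, as far as I know, are no longer functional from R 4.2.0 onwards.

```r
# Hypothetical large object standing in for data that is no longer needed.
big <- numeric(5e6)                      # roughly 40 MB of doubles
print(object.size(big), units = "Mb")    # size of this one object

rm(big)                                  # drop the binding ...
invisible(gc())                          # ... then let R collect the freed memory
still_there <- exists("big")             # check whether the object is gone
```

gc() also returns a small table of current memory use, which can be printed before and after to see the effect.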

On Tue, Nov 2, 2010 at 5:55 PM, Lorenzo Cattarino  wrote:
> I forgot to mention that I am using windows 7 (64-bit) and the R version
> 2.11.1 (64-bit)
>
>

[R] memory allocation problem

2010-11-02 Thread Lorenzo Cattarino

I would also like to include details on my R version

> version
               _
platform       x86_64-pc-mingw32
arch           x86_64
os             mingw32
system         x86_64, mingw32
status
major          2
minor          11.1
year           2010
month          05
day            31
svn rev        52157
language       R
version.string R version 2.11.1 (2010-05-31)

from FAQ 2.9
(http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-to-be-a-limit-on-the-memory-it-uses_0021)
it says that:

"For a 64-bit build, the default is the amount of RAM"

So in my case the amount of RAM would be 4 GB. R should be able to
allocate a vector of size 5 Mb without me typing any command (either as
memory.limit() or appended string in the target path), is that right?

Thank you a lot

Lorenzo

 



Re: [R] Drawing circles on a chart

2010-11-02 Thread Gabor Grothendieck

On Tue, Nov 2, 2010 at 10:07 PM, Santosh Srinivas
 wrote:
> Dear Group,
> I have the following data matrix which is a timeseries.
>
>> dput(tData)
> structure(list(A = c(0.2, 0.13, 0.05, 0.1, 0.02, 0.18, 0.09,
> 0.06, 0.13), B = c(0.15, 0.06, 0.09, 0.02, 0.03, 0.12, 0.01,
> 0.15, 0.06), C = c(-0.1, 0, -0.07, -0.06, -0.05, -0.05, -0.06,
> -0.08, -0.07), D = c(-0.15, -0.05, -0.1, -0.03, -0.13, -0.04,
> -0.1, -0.04, -0.15), E = c(-0.17, -0.16, -0.08, -0.07, -0.09,
> -0.14, -0.1, -0.05, 0)), .Names = c("A", "B", "C", "D", "E"), class =
> "data.frame", row.names = c(NA,
> -9L))
>
>
> I am trying to display this data in a graphic. The values vary from -0.2 to
> +0.2
> There should be a table with 5 Rows and 9 Columns. Rows labeled A to E and
> Columns labeled 1 to 9.
> Inside each cell there should be a circle (sphere preferable) with radius of
> mod(data value). The color should be either red or green depending on -ve or
> +ve and the intensity should be based on the value of the datapoint.
>

See balloonplot in the gplots package.

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

[R] Drawing circles on a chart

2010-11-02 Thread Santosh Srinivas

Dear Group,
I have the following data matrix which is a timeseries. 

> dput(tData)
structure(list(A = c(0.2, 0.13, 0.05, 0.1, 0.02, 0.18, 0.09, 
0.06, 0.13), B = c(0.15, 0.06, 0.09, 0.02, 0.03, 0.12, 0.01, 
0.15, 0.06), C = c(-0.1, 0, -0.07, -0.06, -0.05, -0.05, -0.06, 
-0.08, -0.07), D = c(-0.15, -0.05, -0.1, -0.03, -0.13, -0.04, 
-0.1, -0.04, -0.15), E = c(-0.17, -0.16, -0.08, -0.07, -0.09, 
-0.14, -0.1, -0.05, 0)), .Names = c("A", "B", "C", "D", "E"), class =
"data.frame", row.names = c(NA, 
-9L))


I am trying to display this data in a graphic. The values vary from -0.2 to
+0.2
There should be a table with 5 Rows and 9 Columns. Rows labeled A to E and
Columns labeled 1 to 9.
Inside each cell there should be a circle (sphere preferable) with radius of
mod(data value). The color should be either red or green depending on -ve or
+ve and the intensity should be based on the value of the datapoint.

Any help on how to go about this?

Thanks,
S

[R] memory allocation problem

2010-11-02 Thread Lorenzo Cattarino

Hi R users

I am trying to run a non linear parameter optimization using the
function optim() and I have problems regarding memory allocation.

My data are in a dataframe with 9 columns. There are 656100 rows.

> head(org_results)
  comb.id   p H1 H2 Range Rep no.steps  dist aver.hab.amount
1   1   0.1  0  0 11000 0.2528321  0.1393901
2   1   0.1  0  0 11000 0.4605934  0.1011841
3   1   0.1  0  0 11004 3.4273670  0.1052789
4   1   0.1  0  0 11004 2.8766364  0.1022138
5   1   0.1  0  0 11000 0.3496872  0.1041056
6   1   0.1  0  0 11000 0.1050840  0.3572036

> est_coeff <- optim(coeff,SS, steps=org_results$no.steps,
Range=org_results$Range, H1=org_results$H1, H2=org_results$H2,
p=org_results$p)
Error: cannot allocate vector of size 5.0 Mb
In addition: Warning messages:
1: In optim(coeff, SS, steps = org_results$no.steps, Range =
org_results$Range,  : Reached total allocation of 1Mb: see
help(memory.size)
2: In optim(coeff, SS, steps = org_results$no.steps, Range =
org_results$Range,  : Reached total allocation of 1Mb: see
help(memory.size)
3: In optim(coeff, SS, steps = org_results$no.steps, Range =
org_results$Range,  : Reached total allocation of 1Mb: see
help(memory.size)
4: In optim(coeff, SS, steps = org_results$no.steps, Range =
org_results$Range,  : Reached total allocation of 1Mb: see
help(memory.size)

> memory.size()
[1] 9978.19
> memory.limit()
[1] 1

I know that I am not sending reproducible codes but I was hoping that
you could help me understand what is going on. I set a maximum limit of
1 mega byte (by writing this string --max-mem-size=1M after the
target path, right click on R icon, shortcut tab). And R is telling me
that it cannot allocate a vector of size 5 Mb???

Thank you for your help

Lorenzo


[R] memory allocation problem

2010-11-02 Thread Lorenzo Cattarino

I forgot to mention that I am using windows 7 (64-bit) and the R version
2.11.1 (64-bit)

Thank you

Lorenzo

 



Re: [R] rgl.snapshot() : no longer works?

2010-11-02 Thread Remko Duursma


Ok, thanks. I just found the new (?) function rgl.postscript() , which works
better for me anyway.

Remko



Duncan Murdoch-2 wrote:
> 
> On 02/11/2010 8:24 PM, Remko Duursma wrote:
>>
>> Hi all,
>>
>>> library(rgl)
>>> plot3d(1,1,1)
>>> snapshot3d("somefile.png")
>> Error in rgl.snapshot(...) :
>>pixmap save format not supported in this build
>>
>>
>> Why does this no longer work?
> 
> The build for 2.12.0 on CRAN doesn't have png support built in.  I'm 
> currently working with Uwe to fix this.
> 
> Duncan Murdoch
> 
>>
>> thanks,
>> Remko
>>
>>> sessionInfo()
>> R version 2.12.0 (2010-10-15)
>> Platform: i386-pc-mingw32/i386 (32-bit)
>>
>> locale:
>> [1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252
>> [3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
>> [5] LC_TIME=English_Australia.1252
>>
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods   base
>>
>> other attached packages:
>> [1] YPLANTER2_0.1   LeafAngle_1.0.3 gpclib_1.5-1    geometry_0.1-7
>> rgl_0.92.794
>>
>> loaded via a namespace (and not attached):
>> [1] tools_2.12.0
> 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/rgl-snapshot-no-longer-works-tp3024694p3024713.html
Sent from the R help mailing list archive at Nabble.com.

[R] package 'np' and point estimation with multiple predictors

2010-11-02 Thread Eileen Meyer


(disclaimer: I'm in physics, not stats... )

I have a multivariate problem.
One variable, call it R1, and 3 "predictor" variables, P1, P2, P3.
My goal is to take a load of training data (I know R1,P1,P2,P3 for about 
700 total points), and then predict R1 for a new set of data for which I 
have all the predictors.  Simple, no?


I understand how to calculate bandwidths, and I have a kind of 
bastardized way of getting the conditional distribution, i.e.,


f(R1|P1=0.8,P2=0.2,P3=2)

using

fitted(npudens(bw=bw,edat=newdata))

evaluating over a vector of R1.

I have then been using this "density" to get a maximum likelihood 
estimator of R1- I have no idea if that is really valid, and if anyone 
wants to yell at me go ahead, I want to do this the correct way and I'm 
sure I'm making it harder than it is.


Moving past that, the technical problem I am facing is getting a 
prediction interval from this.


There's npqreg, and I get how it works when you have one predictor, but 
what happens when you have many?


What I want to do is get the 0.05 and 0.95 quantile for a given 
P1,P2,P3. to use as my prediction interval.


Thanks,
EM
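
One way to picture a prediction interval with several predictors is a kernel-weighted empirical quantile: weight each training point by how close its (P1, P2, P3) is to the point of interest, then read the 0.05 and 0.95 quantiles off the weighted distribution of R1. The sketch below does this in base R with a product Gaussian kernel. It is not the np package's implementation; the function name cond_quantile, the fixed bandwidths h and the simulated data are all assumptions made for illustration.

```r
# Kernel-weighted conditional quantiles (base R sketch, not np's method).
# X: n x d matrix of predictors, y: response, x0: point of interest (length d),
# h: one bandwidth per predictor, tau: quantile levels.
cond_quantile <- function(X, y, x0, h, tau = c(0.05, 0.95)) {
  # product Gaussian kernel weight of each training point relative to x0
  z <- sweep(sweep(X, 2, x0, "-"), 2, h, "/")
  w <- exp(-0.5 * rowSums(z^2))
  w <- w / sum(w)
  # weighted empirical CDF of y, inverted at each tau
  o <- order(y)
  cdf <- cumsum(w[o])
  sapply(tau, function(t) y[o][which(cdf >= t)[1]])
}

# Simulated stand-in for the ~700 training points (hypothetical data).
set.seed(1)
n <- 700
X <- matrix(runif(n * 3), n, 3)
y <- X[, 1] + rnorm(n, sd = 0.1)
q <- cond_quantile(X, y, x0 = c(0.5, 0.5, 0.5), h = c(0.2, 0.2, 0.2))
```

In practice the bandwidths would come from a data-driven selector rather than being fixed by hand as they are here.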

Re: [R] Question about ggplot2

2010-11-02 Thread Shige Song

Dear Josh,

This is exactly what I want, thank you so much!

Best,
Shige

On Tue, Nov 2, 2010 at 8:25 PM, Joshua Wiley  wrote:
> Dear Shige,
>
> This is a feature that lets you view information about the specific
> data you are viewing.  If you merely want a visual adjustment, use
> coord_cartesian():
>
> year.plot + stat_summary(fun.y = "mean", geom = "line") +
>  coord_cartesian(ylim = c(0, 0.1))
>
> HTH,
>
> Josh
>
> On Tue, Nov 2, 2010 at 5:20 PM, Shige Song  wrote:
>> Dear Josh and Abhijit,
>>
>> Thanks for the help. The interesting thing is that the option "limits
>> = c(0, .1)" or "ylim(0,0.1)" also eliminates cases whose values are
>> greater than 0.1 and report missing values, which is not what I want.
>> Is there a way to keep all the cases for the computation of the
>> summary statistics and change the y limits in the final graph?
>>
>> Shige
>>
>> On Tue, Nov 2, 2010 at 10:33 AM, Joshua Wiley  wrote:
>>> Dear Shige,
>>>
>>> You can use scale_y_continuous() to achieve this.
>>>
>>> year.plot <- ggplot(d, aes(year, rate))
>>> year.plot + stat_summary(fun.y = "mean", geom = "line") +
>>>  scale_y_continuous(limits = c(0, .1))
>>>
>>> where limits may be whatever you like for the y axis.
>>>
>>> Cheers,
>>>
>>> Josh
>>>
>>> On Tue, Nov 2, 2010 at 6:57 AM, Shige Song  wrote:
>>>> Dear All,
>>>>
>>>> I am trying to graph a simple scatter plot where the x axis is year
>>>> and the y axis is a percentage (percentage of infant death). Instead
>>>> of plotting the raw data, I want to plot summary statistics such as
>>>> mean and median. Here is the problem: the value range of y is between
>>>> 0 and 1, but since infant death is a rare event, the mean and median
>>>> is very low (something like 5%), which shows up as a horizontal line
>>>> at the bottom of the figure. My question is: how do I change the scale
>>>> of the y-axis so that it does not have the range between 0 and 1 but
>>>> between 0 and 0.1? Many thanks.
>>>>
>>>> By the way, I am using ggplot2, and here is my code:
>>>>
>>>> ---
>>>> year.plot <- ggplot(d, aes(year, rate))
>>>> year.plot + stat_summary(fun.y = "mean", geom = "line")
>>>> ---
>>>>
>>>> Best,
>>>> Shige

>>>
>>>
>>>
>>> --
>>> Joshua Wiley
>>> Ph.D. Student, Health Psychology
>>> University of California, Los Angeles
>>> http://www.joshuawiley.com/
>>>
>>
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://www.joshuawiley.com/
>

Re: [R] rgl.snapshot() : no longer works?

2010-11-02 Thread Duncan Murdoch


On 02/11/2010 8:24 PM, Remko Duursma wrote:
>
> Hi all,
>
> > library(rgl)
> > plot3d(1,1,1)
> > snapshot3d("somefile.png")
> Error in rgl.snapshot(...) :
>    pixmap save format not supported in this build
>
>
> Why does this no longer work?

The build for 2.12.0 on CRAN doesn't have png support built in.  I'm
currently working with Uwe to fix this.

Duncan Murdoch

>
> thanks,
> Remko
>
> > sessionInfo()
> R version 2.12.0 (2010-10-15)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252
> [3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C
> [5] LC_TIME=English_Australia.1252
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> other attached packages:
> [1] YPLANTER2_0.1   LeafAngle_1.0.3 gpclib_1.5-1    geometry_0.1-7
> rgl_0.92.794
>
> loaded via a namespace (and not attached):
> [1] tools_2.12.0


Re: [R] Question about ggplot2

2010-11-02 Thread Joshua Wiley

Dear Shige,

This is a feature that lets you view information about the specific
data you are viewing.  If you merely want a visual adjustment, use
coord_cartesian():

year.plot + stat_summary(fun.y = "mean", geom = "line") +
  coord_cartesian(ylim = c(0, 0.1))

HTH,

Josh
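
The difference between the two approaches can be sketched with made-up data: limits on the scale drop out-of-range values before the summary is computed, while coord_cartesian() only changes the visible window. The data frame d below is invented for illustration, and the base R lines merely mimic what each option does to the mean. The ggplot2 part runs only if the package is installed and uses the argument name fun, which recent ggplot2 releases use in place of fun.y.

```r
# Hypothetical data: mostly small rates, plus a few large values in 2005.
set.seed(42)
d <- data.frame(year = rep(2001:2005, each = 20),
                rate = c(rbeta(95, 1, 30), rep(0.6, 5)))

# Roughly what scale limits do: values above 0.1 become missing first,
# so the yearly mean is computed on a censored subset.
mean_censored <- tapply(ifelse(d$rate > 0.1, NA, d$rate), d$year,
                        mean, na.rm = TRUE)

# Roughly what coord_cartesian() does: the mean uses every row and
# only the displayed range changes.
mean_full <- tapply(d$rate, d$year, mean)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(d, aes(year, rate)) +
    stat_summary(fun = "mean", geom = "line") +  # fun.y in older releases
    coord_cartesian(ylim = c(0, 0.1))            # zoom without dropping data
}
```

Comparing mean_full and mean_censored for 2005 shows how much the censored version understates the mean.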

On Tue, Nov 2, 2010 at 5:20 PM, Shige Song  wrote:
> Dear Josh and Abhijit,
>
> Thanks for the help. The interesting thing is that the option "limits
> = c(0, .1)" or "ylim(0,0.1)" also eliminates cases whose values are
> greater than 0.1 and report missing values, which is not what I want.
> Is there a way to keep all the cases for the computation of the
> summary statistics and change the y limits in the final graph?
>
> Shige
>
> On Tue, Nov 2, 2010 at 10:33 AM, Joshua Wiley  wrote:
>> Dear Shige,
>>
>> You can use scale_y_continuous() to achieve this.
>>
>> year.plot <- ggplot(d, aes(year, rate))
>> year.plot + stat_summary(fun.y = "mean", geom = "line") +
>>  scale_y_continuous(limits = c(0, .1))
>>
>> where limits may be whatever you like for the y axis.
>>
>> Cheers,
>>
>> Josh
>>
>> On Tue, Nov 2, 2010 at 6:57 AM, Shige Song  wrote:
>>> Dear All,
>>>
>>> I am trying to graph a simple scatter plot where the x axis is year
>>> and the y axis is a percentage (percentage of infant death). Instead
>>> of plotting the raw data, I want to plot summary statistics such as
>>> mean and median. Here is the problem: the value range of y is between
>>> 0 and 1, but since infant death is a rare event, the mean and median
>>> is very low (something like 5%), which shows up as a horizontal line
>>> at the bottom of the figure. My question is: how do I change the scale
>>> of the y-axis so that it does not have the range between 0 and 1 but
>>> between 0 and 0.1? Many thanks.
>>>
>>> By the way, I am using ggplot2, and here is my code:
>>>
>>> ---
>>> year.plot <- ggplot(d, aes(year, rate))
>>> year.plot + stat_summary(fun.y = "mean", geom = "line")
>>> ---
>>>
>>> Best,
>>> Shige
>>>
>>
>>
>>
>> --
>> Joshua Wiley
>> Ph.D. Student, Health Psychology
>> University of California, Los Angeles
>> http://www.joshuawiley.com/
>>
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

[R] rgl.snapshot() : no longer works?

2010-11-02 Thread Remko Duursma


Hi all,

> library(rgl)
> plot3d(1,1,1)
> snapshot3d("somefile.png")
Error in rgl.snapshot(...) : 
  pixmap save format not supported in this build


Why does this no longer work?

thanks,
Remko

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252   
[3] LC_MONETARY=English_Australia.1252 LC_NUMERIC=C  
[5] LC_TIME=English_Australia.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] YPLANTER2_0.1   LeafAngle_1.0.3 gpclib_1.5-1geometry_0.1-7 
rgl_0.92.794   

loaded via a namespace (and not attached):
[1] tools_2.12.0
-- 
View this message in context: 
http://r.789695.n4.nabble.com/rgl-snapshot-no-longer-works-tp3024694p3024694.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Question about ggplot2

2010-11-02 Thread Shige Song

Dear Josh and Abhijit,

Thanks for the help. The interesting thing is that the option "limits
= c(0, .1)" or "ylim(0,0.1)" also eliminates cases whose values are
greater than 0.1 and reports them as missing values, which is not what I want.
Is there a way to keep all the cases for the computation of the
summary statistics and change the y limits in the final graph?

Shige

On Tue, Nov 2, 2010 at 10:33 AM, Joshua Wiley  wrote:
> Dear Shige,
>
> You can use scale_y_continuous() to achieve this.
>
> year.plot <- ggplot(d, aes(year, rate))
> year.plot + stat_summary(fun.y = "mean", geom = "line") +
>  scale_y_continuous(limits = c(0, .1))
>
> where limits may be whatever you like for the y axis.
>
> Cheers,
>
> Josh
>
> On Tue, Nov 2, 2010 at 6:57 AM, Shige Song  wrote:
>> Dear All,
>>
>> I am trying to graph a simple scatter plot where the x axis is year
>> and the y axis is a percentage (percentage of infant death). Instead
>> of plotting the raw data, I want to plot summary statistics such as
>> mean and median. Here is the problem: the value range of y is between
>> 0 and 1, but since infant death is a rare event, the mean and median
>> is very low (something like 5%), which shows up as a horizontal line
>> at the bottom of the figure. My question is: how do I change the scale
>> of the y-axis so that it does not have the range between 0 and 1 but
>> between 0 and 0.1? Many thanks.
>>
>> By the way, I am using ggplot2, and here is my code:
>>
>> ---
>> year.plot <- ggplot(d, aes(year, rate))
>> year.plot + stat_summary(fun.y = "mean", geom = "line")
>> ---
>>
>> Best,
>> Shige
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://www.joshuawiley.com/
>
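
Editorial sketch for the question above, assuming a current ggplot2 is installed. Scale limits (scale_y_continuous(limits = ...) or ylim()) drop out-of-range values before stat_summary() sees them, whereas coord_cartesian() only changes the visible window, so the summary is still computed from every case. The data frame d is simulated purely for illustration, and recent ggplot2 releases spell the argument fun rather than fun.y.

```r
library(ggplot2)

# Simulated stand-in for the poster's data: a rare 0/1 outcome per year.
set.seed(1)
d <- data.frame(year = rep(2000:2009, each = 200),
                rate = rbinom(2000, 1, 0.05))

# coord_cartesian() zooms the y axis without discarding any observations,
# so the yearly means are still computed from all rows of d.
p <- ggplot(d, aes(year, rate)) +
  stat_summary(fun = mean, geom = "line") +
  coord_cartesian(ylim = c(0, 0.1))
p
```

With scale limits instead, every row with rate = 1 would be removed first and the plotted "mean" would collapse to zero.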


[R] bugs and misfeatures in polr(MASS).... fixed!

2010-11-02 Thread Mr Timothy James BENHAM

In polr.R the (several) functions gmin and fmin contain the code

> theta <- beta[pc + 1L:q]
> gamm <- c(-100, cumsum(c(theta[1L], exp(theta[-1L]))), 100)

That's bad. There's no reason to suppose beta[pc+1L] is larger than
-100 or that the cumulative sum is smaller than 100. For practical
datasets those assumptions are frequently violated, causing the
optimization to fail. A work-around is to center the explanatory
variables. This helps keep the zetas small.

The correct approach is to use the values -Inf and Inf as the first
and last cut points. The functions plogis, dnorm, etc all behave
correctly when the input is one of these values. The dgumbel function
does not, returning NaN for -Inf. Correct this as follows

dgumbel <- function (x, loc = 0, scale = 1, log = FALSE)
{
x <- (x - loc)/scale
d <- log(1/scale) - x - exp(-x)
d[is.nan(d)] <- -Inf# -tjb
if (!log) exp(d) else d
}
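
Editorial sketch of the behaviour described above, using base R only. The two function names are made up for this illustration: dgumbel_unguarded is the posted function without the NaN line, dgumbel_guarded is the posted function as is.

```r
# Gumbel density without the guard: at x = -Inf the log density is
# Inf - Inf, which evaluates to NaN.
dgumbel_unguarded <- function(x, loc = 0, scale = 1, log = FALSE) {
  x <- (x - loc) / scale
  d <- log(1 / scale) - x - exp(-x)
  if (!log) exp(d) else d
}

# Same function with the guard from the post: NaN log densities become -Inf,
# i.e. a density of exactly zero.
dgumbel_guarded <- function(x, loc = 0, scale = 1, log = FALSE) {
  x <- (x - loc) / scale
  d <- log(1 / scale) - x - exp(-x)
  d[is.nan(d)] <- -Inf
  if (!log) exp(d) else d
}

at_inf <- c(-Inf, Inf)
plogis(at_inf)             # logistic CDF at the infinite cut points
dnorm(at_inf)              # normal density at the infinite cut points
dgumbel_unguarded(at_inf)  # first element is NaN
dgumbel_guarded(at_inf)    # both elements are finite
```

One side effect worth knowing: the guard also maps a genuinely NaN input to density zero rather than propagating the NaN.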

The documentation states

>start: initial values for the parameters.  This is in the format
>   'c(coefficients, zeta)': see the Values section. 

The relevant code is

>   s0 <- if(pc > 0) c(start[seq_len(pc+1)], diff(start[-seq_len(pc)]))
>   else c(start[1L], diff(start))

This doesn't take the logs of the differences, as required to recast
the zetas into the form used in the optimization. The fix is
obvious. polr.fit has the same problem which is responsible for
summary() frequently failing when the Hessian is not provided.
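
Editorial sketch of the reparameterisation being discussed, in base R. The helper names zeta_to_theta and theta_to_zeta are hypothetical, and reading "the fix" as "take log() of the differences" is an interpretation of the post, not a quote from the MASS source. The inverse map is the cumsum(c(theta[1L], exp(theta[-1L]))) expression quoted earlier from gmin/fmin.

```r
# Cut points (zeta) -> optimiser parameters (theta): first cut point as is,
# then the logs of the successive gaps.
zeta_to_theta <- function(zeta) c(zeta[1L], log(diff(zeta)))

# Optimiser parameters -> cut points, as in the quoted gmin/fmin code.
theta_to_zeta <- function(theta) cumsum(c(theta[1L], exp(theta[-1L])))

zeta <- c(-250, -3.5, 0.2, 180)   # made-up, strictly increasing cut points
theta <- zeta_to_theta(zeta)
round_trip <- theta_to_zeta(theta)

# Without the log, the starting values are not mapped back to zeta.
no_log <- theta_to_zeta(c(zeta[1L], diff(zeta)))
```

Because the gaps are exponentiated on the way back, any real-valued theta yields increasing cut points, which is presumably why the optimisation works on this scale.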

I'm not convinced the t values reported by summary() are
reliable. I've noticed that a one-dimensional linear transformation of
the independent variables can cause the reported t values to change by
a factor of more than 100, which doesn't seem right.

--tjb


Re: [R] class changed after execution with sqldf

2010-11-02 Thread Gabor Grothendieck

On Tue, Nov 2, 2010 at 7:01 PM, GL  wrote:
>
> Forgot to mention. This works in the PC implementation of R. The results I'm
> seeing here are in Mac OS X with X11 and tcl/tk installed.

Could you provide a minimal reproducible example that illustrates the problem?

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [R] Display of NAs in character columns of a data frame under fix() or edit().

2010-11-02 Thread Rolf Turner


Thanks Peter.  Makes sense.  But I would like to point out
that it *is* possible to distinguish between ``NA'' meaning
North America and ``NA'' meaning ``missing value'' in *ordinary*
printing of character vectors.  E.g.

> x <- c("Europe","Africa","NA",NA,"SA","Antarctica")
> x
[1] "Europe" "Africa" "NA" NA   "SA"
[6] "Antarctica"
> sum(is.na(x))
[1] 1

cheers,

Rolf
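
Editorial sketch expanding on the example above, base R only: besides the quoting seen in ordinary printing, is.na() and a plain comparison tell the string "NA" and the missing value apart, and print(quote = FALSE) marks the missing value as <NA>.

```r
x <- c("Europe", "Africa", "NA", NA, "SA", "Antarctica")

missing_pos <- which(is.na(x))   # position of the true missing value
string_pos  <- which(x == "NA")  # position of the literal string "NA"

# With quotes suppressed, the missing value is displayed as <NA>,
# so it still cannot be confused with the string NA.
print(x, quote = FALSE)
```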

On 3/11/2010, at 11:45 AM, Peter Dalgaard wrote:

> On 11/02/2010 09:45 PM, Rolf Turner wrote:
>> 
>> Example:
>> 
>>  xxx <- data.frame(x=1:26,y=letters)
>>  xxx$x[c(2,4,6,8)] <- NA
>>  xxx$y[c(1,3,5,7)] <- NA
>> 
>>  yyy <- edit(xxx)
>> 
>> The missing values in xxx$y appear as blanks in the spreadsheet window that
>> appears, whereas the missing values in the numeric column "x" appear as "NA"
>> (as I would expect).
>> 
>> Is this a bug or a feature?
> 
> Probably feature, How would you enter abbreviations for North America,
> Noradrenaline, Neil Adams, etc...? On the other hand, it is currently
> impossible to make a field blank.
> 
> Actually, the whole edit() interface is a bit of a long-standing bug.
> It's been with us "forever" (as far as I remember, the spreadsheet
> interface actually predates data frames in R). It was constructed using
> very basic GUI elements on Windows and X11, and it never _quite_ did
> what you'd want it to do.
> 
> Ideas about how to do better seem to have gotten stuck in indecision
> about which graphical toolkit to use. The Rcmdr has a data viewer (but
> not editor) written with the Tcl/Tk interface, which might be a starting
> point.
> 
> -- 
> Peter Dalgaard
> Center for Statistics, Copenhagen Business School
> Phone: (+45)38153501
> Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] class changed after execution with sqldf

2010-11-02 Thread GL


Forgot to mention. This works in the PC implementation of R. The results I'm
seeing here are in Mac OS X with X11 and tcl/tk installed. 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/class-changed-after-execution-with-sqldf-tp3024592p3024602.html
Sent from the R help mailing list archive at Nabble.com.


[R] class changed after execution with sqldf

2010-11-02 Thread GL


When I run sqldf to merge two datasets, it's changing the Date (class date)
to a numeric value (class factor). Not sure why. Appreciate any insight.
Console output for two datasets and the merged dataset (via sqldf) listed
below.


> summary(df.aggregate)
      Date               Hour           x
 Min.   :2010-07-01   0  :  64   Min.   : 0.00  
 1st Qu.:2010-07-25   1  :  64   1st Qu.: 1.00  
 Median :2010-08-16   2  :  64   Median : 9.00  
 Mean   :2010-08-16   3  :  64   Mean   :11.77  
 3rd Qu.:2010-09-08   4  :  64   3rd Qu.:23.00  
 Max.   :2010-09-30   5  :  64   Max.   :32.00  
  (Other):1152  
> class(df.aggregate$Date)
[1] "Date"
> summary(df.possible.combos)
  Date Hour  
 Min.   :2010-07-01   Min.   : 0.00  
 1st Qu.:2010-07-25   1st Qu.: 5.75  
 Median :2010-08-16   Median :11.50  
 Mean   :2010-08-16   Mean   :11.50  
 3rd Qu.:2010-09-08   3rd Qu.:17.25  
 Max.   :2010-09-30   Max.   :23.00  
> class(df.possible.combos$Date)
[1] "Date"
> #merge raw data and all possible combinations
>   df.final <- sqldf('select Date, Hour, x as RoomsInUse from
> "df.possible.combos"
+ left join "df.aggregate" using (Hour, Date)')
> summary(df.final)
  Date   Hour RoomsInUse   
 14791.0:  24   Min.   : 0.00   Min.   : 0.00  
 14792.0:  24   1st Qu.: 5.75   1st Qu.: 1.00  
 14796.0:  24   Median :11.50   Median : 9.00  
 14797.0:  24   Mean   :11.50   Mean   :11.77  
 14798.0:  24   3rd Qu.:17.25   3rd Qu.:23.00  
 14799.0:  24   Max.   :23.00   Max.   :32.00  
 (Other):1392  
> class(df.final$Date)
[1] "factor"
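
Editorial sketch of a possible workaround, base R only. It does not explain why the class changes on this platform; it only shows how a factor holding day counts such as 14791.0 can be turned back into Date, on the assumption that the numbers are days since 1970-01-01 (R's internal Date representation). The object names are made up.

```r
# A few of the factor levels shown in the summary above.
date_as_factor <- factor(c("14791.0", "14792.0", "14796.0"))

# Factor -> character -> numeric -> Date. Going through as.character()
# matters: as.numeric() on a factor returns the level codes, not the labels.
restored <- as.Date(as.numeric(as.character(date_as_factor)),
                    origin = "1970-01-01")
class(restored)
format(restored)
```

The first value maps to 2010-07-01, which agrees with the minimum Date reported for the two input data frames.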

-- 
View this message in context: 
http://r.789695.n4.nabble.com/class-changed-after-execution-with-sqldf-tp3024592p3024592.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] spliting first 10 words in a string

2010-11-02 Thread steven mosher

Just merge the data.frames back together.

Use merge() or cbind(); cbind() will be easier.

DF1 <- data.frame(x, y, z)
DF2 <- data.frame(DF1$x) # copy a column
# ... then you added columns to DF2

Just put them back together:

DF3 <- cbind(DF2, DF1$y, DF1$z)

If you spend more time with R you will be able to do things like this
elegantly, but for now this way will work and you will learn a bit about R.

As for counting instances of a string, I suggest looking at the table()
command:

k <- c( "all", "but","all")
> table(k)
k
all but
  2   1

So you can do a table for each column in your dataframe
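
Editorial sketch of "a table for each column", base R only. The data frame below is a made-up two-column sample in the spirit of the str() output quoted further down; word_counts is a hypothetical name.

```r
DF <- data.frame(V1 = c("HUMUS", "SLABO", "MALO", "SLABO"),
                 V2 = c("IN", "GRANULIRAN", "PREPEREL", "IN"),
                 stringsAsFactors = FALSE)

# One frequency table per column, most frequent word first.
# Empty strings (padding from the split) are dropped before counting.
word_counts <- lapply(DF, function(col) {
  sort(table(col[col != ""]), decreasing = TRUE)
})

word_counts$V1   # counts for the first field
```

The result is a named list, so word_counts$V3, word_counts$V4 and so on would give the tables for the remaining fields of a wider data frame.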

On Tue, Nov 2, 2010 at 12:53 PM, Matevž Pavlič
wrote:

> Hi,
>
> OK, I got this now. At least I think so. I got a data.frame with 15 fields;
> all other words have been truncated, which is what I want. But I have that
> in a separate data.frame from the one it was in before (it would be nice if
> it were in the same one ...)
>
> 'data.frame':   22801 obs. of  15 variables:
>  $ V1 : chr  "HUMUS" "SLABO" "MALO" "SLABO" ...
>  $ V2 : chr  "IN" "GRANULIRAN" "PREPEREL" "VEZAN" ...
>  $ V3 : chr  "HUMUSNA" "PEŠČEN" "MELJAST" ",KONGLOMERAT," ...
>  $ V4 : chr  "GLINA" "PROD" "PROD" "P0ROZEN," ...
>  $ V5 : chr  "Z" "DO" "DO" "S" ...
>  $ V6 : chr  "MALO" "r" "r" "PLASTMI" ...
>  $ V7 : chr  "PODA," "=" "=" "GFs," ...
>  $ V8 : chr  "LAHKO" "8Q" "60mm," "SIVORJAV" ...
>  $ V9 : chr  "GNETNA," "mm," "S" "" ...
>  $ V10: chr  "RJAVA" "S" "PRODNIKI," "" ...
>  $ V11: chr  "" "PRODNIKI" "MALO" "" ...
>  $ V12: chr  "" "DO" "PEŠČEN" "" ...
>  $ V13: chr  "" "R" "S" "" ...
>  $ V14: chr  "" "=" "TANKIMI" "" ...
>
> Now I have another problem. Is it possible to count which word occurs
> most often in each field (V1, V2, V3, ...), which one is second, and so
> on? Ideally I would create a table for each field (V1, V2, V3, ...) with the
> word and the number of occurrences in that field (column).
> I suppose it could be done in SQL, but having seen what R can do, I
> guess this can be done here too?
>
> Thanks, m
>
> -Original Message-
> From: David Winsemius [mailto:dwinsem...@comcast.net]
> Sent: Tuesday, November 02, 2010 8:23 PM
> To: Matevž Pavlič
> Cc: Gaj Vidmar; r-h...@stat.math.ethz.ch
> Subject: Re: [R] spliting first 10 words in a string
>
>
> On Nov 2, 2010, at 3:01 PM, Matevž Pavlič wrote:
>
> > Hi all,
> >
> > Thanks for all the help. I managed to do it with what Gaj suggested
> > (Excel :().
> >
> > The last solution from David is also great; I just don't understand why
> > R put the words in 14 columns and three rows?
>
> Because the maximum number of words was 14 and the fill argument was TRUE.
> There were three rows because there were three items in the supplied
> character vector.
>
> > I would like it to put just the first 10 words in the source field into 10
> > different destination fields, but in the same row. And so on... is that
> > possible?
>
> I don't know what a destination field might be. Those are not R data types.
>
> This would trim the extra columns (in this example set to those greater
> than 8) by adding a lot of "NULL"'s to the end of a colClasses specification
>  at the expense of a warning message which can be
> ignored:
>
>  > read.table(textConnection(words), fill=T, colClasses =
> c(rep("character", 8), rep("NULL", 30) ) , stringsAsFactors=FALSE )
>    V1    V2    V3      V4    V5    V6    V7      V8
> 1   I  have     a columnn  with  text  that     has
> 2   I would  like      to split these words      in
> 3 but  just first     ten words    in   the string.
> Warning message:
> In read.table(textConnection(words), fill = T, colClasses =
> c(rep("character",  :
>   cols = 14 != length(data) = 38
>
>
> If you want to assign the first column to a variable then just:
>  > first8 <- read.table(textConnection(words), fill=T, colClasses =
> c(rep("character", 8), rep("NULL", 30) ) , stringsAsFactors=FALSE)
>  > var1 <- first8[[1]]
>  > var1
> [1] "I"   "I"   "but"
>
> --
> David.
>
> >
> > Thank you, m
> > -Original Message-
> > From: r-help-boun...@r-project.org
> > [mailto:r-help-boun...@r-project.org
> > ] On Behalf Of David Winsemius
> > Sent: Tuesday, November 02, 2010 3:47 PM
> > To: Gaj Vidmar
> > Cc: r-h...@stat.math.ethz.ch
> > Subject: Re: [R] spliting first 10 words in a string
> >
> >
> > On Nov 2, 2010, at 6:24 AM, Gaj Vidmar wrote:
> >
> >> Though  in this list, in Excel it's just (literally!) five
> >> clicks away!
> >> (with the column in question selected) Data -> Text to Columns ->
> >> Delimited -> tick Space -> Finish Pa je! (~Voila in Slovenian) (then
> >> import back to R, keeping only the first 10 columns if so
> >> desired)
> >
> > You could do the same thing without needing to leave R. Just
> > read.table( textConnection(..), header=FALSE, fill=TRUE)
> >
> >> read.table(textConnection(words), fill=T)
> >V1V2V3  V4V5V6V7  V8   V9
> > V10  V11   V12 V13 V14
> > 1   I  have a columnn  with  text  that hasquite
> > a

Re: [R] Display of NAs in character columns of a data frame under fix() or edit().

2010-11-02 Thread Peter Dalgaard

On 11/02/2010 09:45 PM, Rolf Turner wrote:
> 
> Example:
> 
>   xxx <- data.frame(x=1:26,y=letters)
>   xxx$x[c(2,4,6,8)] <- NA
>   xxx$y[c(1,3,5,7)] <- NA
> 
>   yyy <- edit(xxx)
> 
> The missing values in xxx$y appear as blanks in the spreadsheet window that
> appears, whereas the missing values in the numeric column "x" appear as "NA"
> (as I would expect).
> 
> Is this a bug or a feature?

Probably a feature. How would you enter abbreviations for North America,
Noradrenaline, Neil Adams, etc...? On the other hand, it is currently
impossible to make a field blank.

Actually, the whole edit() interface is a bit of a long-standing bug.
It's been with us "forever" (as far as I remember, the spreadsheet
interface actually predates data frames in R). It was constructed using
very basic GUI elements on Windows and X11, and it never _quite_ did
what you'd want it to do.

Ideas about how to do better seem to have gotten stuck in indecision
about which graphical toolkit to use. The Rcmdr has a data viewer (but
not editor) written with the Tcl/Tk interface, which might be a starting
point.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread Dennis Murphy

Hi:

I don't know why, but it seems that in

bwplot(voice.part ~ height, data = singer,
main = "NOT THE RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red'
'pink' 'violet' 'brown' 'gold'",
fill=c("yellow","blue","green","red","pink","violet","brown","gold"))

the assignment of colors is offset by 3:

Levels: Bass 2 Bass 1 Tenor 2 Tenor 1 Alto 2 Alto 1 Soprano 2 Soprano 1
fillcol <- c("yellow","blue","green","red","pink","violet","brown","gold")

In the above plot,

yellow -> Bass 2  (1)
blue -> Tenor 1 (4)
green -> Soprano 2  (7)
red -> Bass 1 (10 mod 8 = 2)
pink -> Alto 2 (13 mod 8 = 5)
etc.

It's certainly curious.
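
Editorial sketch of the arithmetic in the observation above, base R only. It reproduces the reported pattern (the k-th colour lands on box 1 + 3(k - 1), wrapped modulo 8) and nothing more. It does not explain why lattice behaves this way, and the last three rows of the mapping are an extrapolation beyond the five pairs listed.

```r
levels_vp <- c("Bass 2", "Bass 1", "Tenor 2", "Tenor 1",
               "Alto 2", "Alto 1", "Soprano 2", "Soprano 1")
fillcol <- c("yellow", "blue", "green", "red",
             "pink", "violet", "brown", "gold")

# Box index that receives the k-th colour under an offset of 3, modulo 8.
k <- seq_along(fillcol)
box <- (3 * (k - 1)) %% length(levels_vp) + 1

mapping <- data.frame(colour = fillcol, box = box, level = levels_vp[box])
mapping
```

Since 3 and 8 share no common factor, every box receives exactly one colour under this pattern; the colours are permuted rather than repeated.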

Dennis


On Tue, Nov 2, 2010 at 2:51 PM, Rainer Hurling  wrote:

> On 02.11.2010 22:37 (UTC+1), David Winsemius wrote:
>
>>
>> On Nov 2, 2010, at 5:08 PM, Rainer Hurling wrote:
>>
>>  On 02.11.2010 21:43 (UTC+1), David Winsemius wrote:
>>>

 On Nov 2, 2010, at 4:07 PM, Rainer Hurling wrote:

 snipped quite a bit of talking past each otther

>
> Of course your example with eight colours works, too. But as you can
> see in the plot, the colours have different order then in the vector
> 'colors()[(2:9)*10]' itself. I expected the first box (bass2) coloured
> "bisque1", the second box (bass1) "blue4" and so on.
>

 Oh. Try putting the fill argument outside the panel and see if the panel
 handles it in the manner you expect:

 bp3 <- bwplot(voice.part ~ height, data = singer, main = "fill arg
 outside bwplot\n1] 'bisque1' 'blue4' 'burlywood3' 'chartreuse3' 'coral3'
 'cyan2' 'darkgray' 'darkorange", fill=colors()[(2:11)*10],
 panel = function(...) {
 panel.grid(v = -1, h = 0)
 panel.bwplot( ...)
 })
 bp3


> I hope, this explaination is a bit clearer than my preceding ones.
>

 And I hope my suggestion now "works".

>>>
>>> Thank you for the hint, that it works also outside of the panel. It
>>> looks like I missed the wood for trees here ;-)
>>>
>>> In your latest, special case the colours work. After having a nearer
>>> look at it I found that your colour vector has length 10 (2:11), and
>>> only the first eight colours are filled in the boxes.
>>>
>>
>> I don't know why the ordering only is irregularly preserved ...
>> apparently in situations where the number of colors is a multiple of 5.
>> Perhaps a question that Sarkar, Andrews or Ehlers can answer. I looked
>> at the code for bwplot and it uses panel.polygon for drawing the
>> rectangles. The colors and other graphical parameters are supposed to be
>> picked up from the box.rectangle settings in par.settings. (Trying to
>> set those also failed.) I also looked at panel.polygon and do not see a
>> reason for the shuffling of colors.
>>
>
> I also hope that someone from 'inner circle' would have a look ;-)
>
>
>  Wrong order also:
>>  > bp3 <- bwplot(voice.part ~ height, data = singer, main = "fill arg
>> outside bwplot\n1] 'bisque1' 'blue4' 'burlywood3' 'chartreuse3' 'coral3'
>> 'cyan2' 'darkgray' 'darkorange", par.settings =
>> list(box.rectangle=list(fill=colors()[(2:9)*10])), horizontal=TRUE,
>> + panel = function(...) {
>> + panel.grid(v = -1, h = 0)
>> + panel.bwplot( ...)
>> + })
>>  > bp3
>>
>
> Yes, I tried to manipulate box.rectangle myself with also no success. I
> think, the design of panel.bwplot originally allows only for using one fill
> color (just a guess).
>
>
>  This seems to be reproducible:
>>>
>>> ### NOT WORKING: 8 colours in the not in order of given vector
>>> bwplot(voice.part ~ height, data = singer,
>>> main = "NOT THE RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red'
>>> 'pink' 'violet' 'brown' 'gold'",
>>> fill=c("yellow","blue","green","red","pink","violet","brown","gold"))
>>>
>>> ### WORKING: 10 (8+2*NA) colours in order of given vector
>>> bwplot(voice.part ~ height, data = singer,
>>> main = "RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red' 'pink'
>>> 'violet' 'brown' 'gold'",
>>> fill=c("yellow","blue","green","red","pink","violet","brown","gold",
>>> NA, NA))
>>>
>>> I really do not understand what is going on here,
>>>
>>
>> Me either.
>>
>
> Thank you so far. I am afraid I have to go to bed. In just a few hours I
> have to work for my employer again ...
>
>
>  Rainer
>>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]


Re: [R] splitting First 10 words in a string

2010-11-02 Thread steven mosher

Line should be:

first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=nrow(
sent))

Sorry, cut-and-paste error.

On Tue, Nov 2, 2010 at 3:32 PM, steven mosher wrote:

>  That's easy you are confusing the dummy code I sent.
>
>  Do this:
>
>  lit<-read.csv("litologija.csv", sep=";", dec=".")
> sent <-data.frame(sentence=lit$Opis,stringsAsFactors=FALSE)
>
> first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=nrow(
> sent)
>
> I put the length of the vector to 10 just to do a dummy problem.
>
> Then do this:
>
> for(j in 1:nrow(sent) {
>
>   sent[j,2:11]<-strsplit(sent[j,1]," ")[[1]][1:10]
>
> }
>
>
> That will get you a result the crude brute force way.
>
> try that.
>
> Then you can learn sapply way. but first you need to learn R data
> structures.
>
>
>
>
>
> On Tue, Nov 2, 2010 at 1:47 PM, Matevž Pavlič
> wrote:
>
>> Hi Steven,
>>
>>
>>
>> Thank you for the help. I get an error though when i do this :
>>
>>
>>
>> >lit<-read.csv("litologija.csv", sep=";", dec=".")
>>
>> >sent <-data.frame(sentence=lit$Opis,stringsAsFactors=FALSE)
>>
>> >str(sent)
>>
>> >sentV<-rep(sent,10)
>>
>> >str(sentV)
>>
>>
>>
>>
>> >first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=10)
>>
>> >DF
>> <-data.frame(Sentence=sent,first,second,third,fourth,fifth,sixth,seventh,eighth,ninth,tenth,stringsAsFactors=FALSE)
>>
>>
>>
>> »Error in data.frame(Sentence = sent, first, second, third, fourth,
>> fifth,  :
>>
>> arguments imply differing number of rows: 22928, 10«
>>
>>
>>
>> What am I doing wrong?
>>
>>
>>
>> Thnks, m
>>
>>
>>
>>
>>
>>
>>
>> *From:* steven mosher [mailto:mosherste...@gmail.com]
>> *Sent:* Tuesday, November 02, 2010 8:45 PM
>> *To:* David Winsemius
>> *Cc:* Matevž Pavlič; Gaj Vidmar; r-h...@stat.math.ethz.ch
>> *Subject:* Re: [R] spliting first 10 words in a string
>>
>>
>>
>>  Thanks david.
>>
>>
>>
>>   Matevz, maybe I can help explain by doing a very simple and brute force
>> approach
>>
>> as opposed to  the way david did it. But you should learn his methods.
>>
>>
>>
>> I will just do a subset of your problem and if you understand how it works
>> then you should
>>
>> be able to get something done and then make it more elegant.
>>
>>
>>
>> First, I simplify the problem by separating out the "sentence" column.
>>
>>
>>
>> You can do this with your data frame by simply doing this
>>
>>
>>
>> MySentence <-data.frame(sentence=yourbigDF$Opis,stringsAsFactors=FALSE)
>>
>>
>>
>> so I take your original data.frame (yourbigDF) and I just create a copy of
>> that one column
>>
>>  $Opis
>>
>>
>>
>> Later we can merge the two back together after I add 10 columns for the
>> words
>>
>>
>>
>>
>>
>> Lets make some dummy data with just 10 rows
>>
>>
>>
>>
>>
>>
>>
>>  sentence<- "this is a sentence with ten words or maybe more than ten
>> words"
>>
>>  sentV<-rep(sentence,10)
>>
>> # now I just made 10 rows of the same sentence
>>
>> # NEXT because I am going to create 10 new colums of 10 rows I create
>>
>> # 10 vectors> each is named and each has 10 elements For the rows.
>>
>> # they have NO DATA in them
>>
>>
>>
>>
>>  
>> first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=10)
>>
>>
>>
>> #Next I create a dataframe with Sentence in the first column and 10 blank
>> colums.
>>
>> # NOTE I use stringsAsFactors=False
>>
>>
>>
>>  DF
>> <-data.frame(Sentence=sentence,first,second,third,fourth,fifth,sixth,seventh,eighth,ninth,tenth,stringsAsFactors=FALSE)
>>
>>
>>
>> # This is what it would look like ( the first row)
>>
>> DF[1,]
>>
>>
>>
>> Sentence first second third fourth fifth sixth seventh eighth ninth tenth
>>
>> 1 this is a sentence with ten words or maybe more than ten words FALSE
>>  FALSE FALSE  FALSE FALSE FALSE   FALSE  FALSE FALSE FALSE
>>
>>
>>
>> Next, I will show you how to assign the first ten words to the 10 blank
>> columns
>>
>>
>>
>> DF[1,2:11]<-strsplit(DF[1,1]," ")[[1]][1:10]
>>
>>
>>
>> #DF[1,2:11]  selects the columns 2-11 of the first row
>>
>> #strsplit  returns the first 10 words [1:10] and place them in the
>> columsn2-11
>>
>>
>>
>> If you want to do this the slow way you can just loop through your
>> dataframe row by row
>>
>> or you can probably use apply.
>>
>>
>>
>> Make more sense?
>>
>> > DF[1,2:11]<-strsplit(DF[1,1]," ")[[1]][1:10]
>>
>> > DF[1,]
>>
>> Sentence first
>> second third   fourth fifth sixth seventh eighth ninth tenth
>>
>> 1 this is a sentence with ten words or maybe more than ten words  this
>> is a sentence  with   ten   words or maybe  more
>>
>> > DF[1,"first"]
>>
>> [1] "this"
>>
>>
>>
>> On Tue, Nov 2, 2010 at 12:22 PM, David Winsemius 
>> wrote:
>>
>>
>> On Nov 2, 2010, at 3:01 PM, Matevž Pavlič wrote:
>>
>> Hi all,
>>
>> Thanks for all the help. I managed to do it with what Gaj suggested (Excel
>> :().
>>
>> The last solution from David is also freat i just don't undestand why R
>>  put the w

Re: [R] splitting First 10 words in a string

2010-11-02 Thread steven mosher

 That's easy; you are confusing the dummy code I sent.

 Do this:

lit <- read.csv("litologija.csv", sep=";", dec=".")
sent <- data.frame(sentence=lit$Opis, stringsAsFactors=FALSE)
first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=nrow(sent))

I put the length of the vector to 10 just to do a dummy problem.

Then do this:

for(j in 1:nrow(sent)) {

  sent[j,2:11]<-strsplit(sent[j,1]," ")[[1]][1:10]

}


That will get you a result the crude brute force way.

try that.

Then you can learn sapply way. but first you need to learn R data
structures.
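
Editorial sketch of the "sapply way" mentioned above, base R only. The sample sentences and the names first_ten and result are made up for illustration; with the real data, sent would come from the read.csv() call shown earlier.

```r
sent <- data.frame(
  sentence = c("this is a sentence with ten words or maybe more than ten words",
               "a short one",
               "I would like to split these words in separate columns of a data frame"),
  stringsAsFactors = FALSE
)

# Split every sentence on single spaces and keep the first ten words.
# Indexing with 1:10 pads sentences shorter than ten words with NA.
first_ten <- t(sapply(strsplit(sent$sentence, " ", fixed = TRUE),
                      function(w) w[1:10]))
colnames(first_ten) <- paste0("word", 1:10)

# One row per sentence: the original text plus ten word columns.
result <- data.frame(sent, first_ten, stringsAsFactors = FALSE)
str(result)
```

This keeps the words in the same data frame as the source column, and avoids growing the data frame one row at a time inside a loop.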





On Tue, Nov 2, 2010 at 1:47 PM, Matevž Pavlič wrote:

> Hi Steven,
>
>
>
> Thank you for the help. I get an error though when i do this :
>
>
>
> >lit<-read.csv("litologija.csv", sep=";", dec=".")
>
> >sent <-data.frame(sentence=lit$Opis,stringsAsFactors=FALSE)
>
> >str(sent)
>
> >sentV<-rep(sent,10)
>
> >str(sentV)
>
>
>
>
> >first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=10)
>
> >DF
> <-data.frame(Sentence=sent,first,second,third,fourth,fifth,sixth,seventh,eighth,ninth,tenth,stringsAsFactors=FALSE)
>
>
>
> »Error in data.frame(Sentence = sent, first, second, third, fourth, fifth,
> :
>
> arguments imply differing number of rows: 22928, 10«
>
>
>
> What am I doing wrong?
>
>
>
> Thnks, m
>
>
>
>
>
>
>
> *From:* steven mosher [mailto:mosherste...@gmail.com]
> *Sent:* Tuesday, November 02, 2010 8:45 PM
> *To:* David Winsemius
> *Cc:* Matevž Pavlič; Gaj Vidmar; r-h...@stat.math.ethz.ch
> *Subject:* Re: [R] spliting first 10 words in a string
>
>
>
>  Thanks david.
>
>
>
>   Matevz, maybe I can help explain by doing a very simple and brute force
> approach
>
> as opposed to  the way david did it. But you should learn his methods.
>
>
>
> I will just do a subset of your problem and if you understand how it works
> then you should
>
> be able to get something done and then make it more elegant.
>
>
>
> First, I simplify the problem by separating out the "sentence" column.
>
>
>
> You can do this with your data frame by simply doing this
>
>
>
> MySentence <-data.frame(sentence=yourbigDF$Opis,stringsAsFactors=FALSE)
>
>
>
> so I take your original data.frame (yourbigDF) and I just create a copy of
> that one column
>
>  $Opis
>
>
>
> Later we can merge the two back together after I add 10 columns for the
> words
>
>
>
>
>
> Lets make some dummy data with just 10 rows
>
>
>
>
>
>
>
>  sentence<- "this is a sentence with ten words or maybe more than ten
> words"
>
>  sentV<-rep(sentence,10)
>
> # now I just made 10 rows of the same sentence
>
> # NEXT because I am going to create 10 new colums of 10 rows I create
>
> # 10 vectors> each is named and each has 10 elements For the rows.
>
> # they have NO DATA in them
>
>
>
>
>  
> first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=10)
>
>
>
> #Next I create a dataframe with Sentence in the first column and 10 blank
> colums.
>
> # NOTE I use stringsAsFactors=False
>
>
>
>  DF
> <-data.frame(Sentence=sentence,first,second,third,fourth,fifth,sixth,seventh,eighth,ninth,tenth,stringsAsFactors=FALSE)
>
>
>
> # This is what it would look like ( the first row)
>
> DF[1,]
>
>
>
> Sentence first second third fourth fifth sixth seventh eighth ninth tenth
>
> 1 this is a sentence with ten words or maybe more than ten words FALSE
>  FALSE FALSE  FALSE FALSE FALSE   FALSE  FALSE FALSE FALSE
>
>
>
> Next, I will show you how to assign the first ten words to the 10 blank
> columns
>
>
>
> DF[1,2:11]<-strsplit(DF[1,1]," ")[[1]][1:10]
>
>
>
> #DF[1,2:11]  selects the columns 2-11 of the first row
>
> #strsplit  returns the first 10 words [1:10] and place them in the
> columsn2-11
>
>
>
> If you want to do this the slow way you can just loop through your
> dataframe row by row
>
> or you can probably use apply.
>
>
>
> Make more sense?
>
> > DF[1,2:11]<-strsplit(DF[1,1]," ")[[1]][1:10]
>
> > DF[1,]
>
> Sentence first
> second third   fourth fifth sixth seventh eighth ninth tenth
>
> 1 this is a sentence with ten words or maybe more than ten words  this
> is a sentence  with   ten   words or maybe  more
>
> > DF[1,"first"]
>
> [1] "this"
>
>
>
> On Tue, Nov 2, 2010 at 12:22 PM, David Winsemius 
> wrote:
>
>
> On Nov 2, 2010, at 3:01 PM, Matevž Pavlič wrote:
>
> Hi all,
>
> Thanks for all the help. I managed to do it with what Gaj suggested (Excel
> :().
>
> The last solution from David is also great; I just don't understand why R
> put the words in 14 columns and three rows?
>
>
>
> Because the maximum number of words was 14 and the fill argument was TRUE.
> There were three rows because there were three items in the supplied
> character vector.
>
>
>
> I would like it to put just the first 10 words in the source field into 10
> different destination fields, but in the same row. And so on... is that
> possible?
>
>
>
> I don't know what a destination field might be. Those are not R data types.
>

Re: [R] Line numbers in Sweave

2010-11-02 Thread Duncan Murdoch


On 02/11/2010 5:50 PM, Yihui Xie wrote:

> Hi,
>
> I thumbed through the source code Sweave.R but was unable to figure
> out when (under what conditions) R will insert the line numbers to the
> output. The R 2.12.0 news said:
>
>  • Parsing errors detected during Sweave() processing will now be
>    reported referencing their original location in the source file.
>
> Do we have any options to turn off this reporting? Thanks!


Sure:  just don't include any syntax errors.

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread Rainer Hurling

On 02.11.2010 22:37 (UTC+1), David Winsemius wrote:

On Nov 2, 2010, at 5:08 PM, Rainer Hurling wrote:

On 02.11.2010 21:43 (UTC+1), David Winsemius wrote:

On Nov 2, 2010, at 4:07 PM, Rainer Hurling wrote:

snipped quite a bit of talking past each other

Of course your example with eight colours works, too. But as you can
see in the plot, the colours are in a different order than in the vector
'colors()[(2:9)*10]' itself. I expected the first box (bass2) coloured
"bisque1", the second box (bass1) "blue4" and so on.

Oh. Try putting the fill argument outside the panel and see if the panel
handles it in the manner you expect:

bp3 <- bwplot(voice.part ~ height, data = singer, main = "fill arg
outside bwplot\n1] 'bisque1' 'blue4' 'burlywood3' 'chartreuse3' 'coral3'
'cyan2' 'darkgray' 'darkorange", fill=colors()[(2:11)*10],
panel = function(...) {
panel.grid(v = -1, h = 0)
panel.bwplot( ...)
})
bp3

I hope this explanation is a bit clearer than my preceding ones.

And I hope my suggestion now "works".

Thank you for the hint, that it works also outside of the panel. It
looks like I missed the wood for trees here ;-)

In your latest, special case the colours work. After having a nearer
look at it I found that your colour vector has length 10 (2:11), and
only the first eight colours are filled in the boxes.

I don't know why the ordering is only irregularly preserved ...
apparently in situations where the number of colors is a multiple of 5.
Perhaps a question that Sarkar, Andrews or Ehlers can answer. I looked
at the code for bwplot and it uses panel.polygon for drawing the
rectangles. The colors and other graphical parameters are supposed to be
picked up from the box.rectangle settings in par.settings. (Trying to
set those also failed.) I also looked at panel.polygon and do not see a
reason for the shuffling of colors.

I also hope that someone from 'inner circle' would have a look ;-)

Wrong order also:
 > bp3 <- bwplot(voice.part ~ height, data = singer, main = "fill arg
outside bwplot\n1] 'bisque1' 'blue4' 'burlywood3' 'chartreuse3' 'coral3'
'cyan2' 'darkgray' 'darkorange", par.settings =
list(box.rectangle=list(fill=colors()[(2:9)*10])), horizontal=TRUE,
+ panel = function(...) {
+ panel.grid(v = -1, h = 0)
+ panel.bwplot( ...)
+ })
 > bp3

Yes, I tried to manipulate box.rectangle myself with also no success. I 
think, the design of panel.bwplot originally allows only for using one 
fill color (just a guess).

This seems to be reproducible:

### NOT WORKING: 8 colours, not in the order of the given vector
bwplot(voice.part ~ height, data = singer,
main = "NOT THE RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red'
'pink' 'violet' 'brown' 'gold'",
fill=c("yellow","blue","green","red","pink","violet","brown","gold"))

### WORKING: 10 (8+2*NA) colours in order of given vector
bwplot(voice.part ~ height, data = singer,
main = "RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red' 'pink'
'violet' 'brown' 'gold'",
fill=c("yellow","blue","green","red","pink","violet","brown","gold",
NA, NA))

I really do not understand what is going on here,

Me either.

Thank you so far. I am afraid I have to go to bed. In just a few hours I 
have to work for my employer again ...

Rainer

David Winsemius, MD
West Hartford, CT


[R] Line numbers in Sweave

2010-11-02 Thread Yihui Xie

Hi,

I thumbed through the source code Sweave.R but was unable to figure
out when (under what conditions) R will insert the line numbers to the
output. The R 2.12.0 news said:

• Parsing errors detected during Sweave() processing will now be
  reported referencing their original location in the source file.

Do we have any options to turn off this reporting? Thanks!

Regards,
Yihui
--
Yihui Xie 
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA


Re: [R] count different words in a field

2010-11-02 Thread David Winsemius

On Nov 2, 2010, at 5:11 PM, Matevž Pavlič wrote:

Hi all,

I started to ask this in the other post, but it is off topic...so
here it is again.

I have a data.frame (created with the help of this mailing list) that
looks like this:

?table
> tbl <- table(c("HUMUS", "SLABO", "MALO", "SLABO"))
> tbl[order(tbl, decreasing = TRUE)][1]
SLABO
2

Just make a function that does this to a vector and use lapply(dfrm,  
func)  on the dataframe.
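
A minimal sketch of that suggestion, assuming a small made-up data frame `dfrm` in place of the real one (the column contents and the helper name `word_counts` are illustrative, not from the thread). It tabulates each column, drops empty strings, and sorts so that the most frequent word comes first:

```r
# Hypothetical stand-in for the real 22801-row data frame
dfrm <- data.frame(
  V1 = c("HUMUS", "SLABO", "MALO", "SLABO"),
  V2 = c("IN", "GRANULIRAN", "IN", "VEZAN"),
  V3 = c("", "PROD", "PROD", ""),
  stringsAsFactors = FALSE
)

# Count the words in one column, most frequent first, ignoring blanks
word_counts <- function(v) {
  v <- v[!is.na(v) & v != ""]
  sort(table(v), decreasing = TRUE)
}

# One frequency table per column
counts <- lapply(dfrm, word_counts)
counts$V1

# The prevailing word of each column
top_word <- sapply(counts, function(tb) names(tb)[1])
top_word
```

Each element of `counts` is a full frequency table, so the second and third most common words are available as `names(counts$V1)[2]` and so on.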

--
David.

'data.frame':   22801 obs. of  15 variables:
$ V1 : chr  "HUMUS" "SLABO" "MALO" "SLABO" ...
$ V2 : chr  "IN" "GRANULIRAN" "PREPEREL" "VEZAN" ...
$ V3 : chr  "HUMUSNA" "PEŠČEN" "MELJAST" ",KONGLOMERAT," ...
$ V4 : chr  "GLINA" "PROD" "PROD" "P0ROZEN," ...
$ V5 : chr  "Z" "DO" "DO" "S" ...
$ V6 : chr  "MALO" "r" "r" "PLASTMI" ...
$ V7 : chr  "PODA," "=" "=" "GFs," ...
$ V8 : chr  "LAHKO" "8Q" "60mm," "SIVORJAV" ...
$ V9 : chr  "GNETNA," "mm," "S" "" ...
$ V10: chr  "RJAVA" "S" "PRODNIKI," "" ...
$ V11: chr  "" "PRODNIKI" "MALO" "" ...
$ V12: chr  "" "DO" "PEŠČEN" "" ...
$ V13: chr  "" "R" "S" "" ...
$ V14: chr  "" "=" "TANKIMI" "" ...

Is it possible to count which word occurs most often in each field
(V1, V2, V3, ...), which one is second, and so on? Ideally I
would like to create a table for each field (V1, V2, V3, ...) with
the prevailing word and the number of occurrences of that word in
that field (column).

Hope that explains it ok...

Thank you, m


David Winsemius, MD
West Hartford, CT


Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread David Winsemius

On Nov 2, 2010, at 5:08 PM, Rainer Hurling wrote:

On 02.11.2010 21:43 (UTC+1), David Winsemius wrote:

On Nov 2, 2010, at 4:07 PM, Rainer Hurling wrote:

snipped quite a bit of talking past each other

Of course your example with eight colours works, too. But as you can
see in the plot, the colours are in a different order than in the vector
'colors()[(2:9)*10]' itself. I expected the first box (bass2) coloured
"bisque1", the second box (bass1) "blue4" and so on.

Oh. Try putting the fill argument outside the panel and see if the panel
handles it in the manner you expect:

bp3 <- bwplot(voice.part ~ height, data = singer, main = "fill arg
outside bwplot\n1] 'bisque1' 'blue4' 'burlywood3' 'chartreuse3' 'coral3'
'cyan2' 'darkgray' 'darkorange", fill=colors()[(2:11)*10],
panel = function(...) {
panel.grid(v = -1, h = 0)
panel.bwplot( ...)
})
bp3

I hope this explanation is a bit clearer than my preceding ones.

And I hope my suggestion now "works".

Thank you for the hint, that it works also outside of the panel. It  
looks like I missed the wood for trees here ;-)

In your latest, special case the colours work. After having a nearer  
look at it I found that your colour vector has length 10 (2:11), and  
only the first eight colours are filled in the boxes.

I don't know why the ordering is only irregularly preserved ...
apparently in situations where the number of colors is a multiple of
5. Perhaps a question that Sarkar, Andrews or Ehlers can answer. I
looked at the code for bwplot and it uses panel.polygon for drawing
the rectangles. The colors and other graphical parameters are supposed
to be picked up from the box.rectangle settings in par.settings.
(Trying to set those also failed.) I also looked at panel.polygon and
do not see a reason for the shuffling of colors.

Wrong order also:
> bp3 <- bwplot(voice.part ~ height, data = singer, main = "fill arg  
outside bwplot\n1] 'bisque1' 'blue4' 'burlywood3' 'chartreuse3'  
'coral3' 'cyan2' 'darkgray' 'darkorange", par.settings =  
list(box.rectangle=list(fill=colors()[(2:9)*10])), horizontal=TRUE,

+  panel = function(...) {
+panel.grid(v = -1, h = 0)
+panel.bwplot( ...)
+   })
> bp3

This seems to be reproducible:

### NOT WORKING: 8 colours, not in the order of the given vector
bwplot(voice.part ~ height, data = singer,
 main = "NOT THE RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red'
'pink' 'violet' 'brown' 'gold'",
 fill=c("yellow","blue","green","red","pink","violet","brown","gold"))

### WORKING: 10 (8+2*NA) colours in order of given vector
bwplot(voice.part ~ height, data = singer,
 main = "RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red'
'pink' 'violet' 'brown' 'gold'",
 fill=c("yellow","blue","green","red","pink","violet","brown","gold",
NA, NA))

I really do not understand what is going on here,

Me either.

Rainer

David Winsemius, MD
West Hartford, CT


Re: [R] connecting points into a smooth curve

2010-11-02 Thread Greg Snow

In addition to the other responses you have received, the xspline function may 
also be of use.
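
As a rough illustration of two base R routes (the five points below are invented for the example): `xspline()` with a negative `shape` draws a smooth curve that passes through the control points, and `spline()` returns interpolated coordinates that `lines()` can draw.

```r
# Five made-up points standing in for the poster's data
x <- c(1, 2, 3, 4, 5)
y <- c(2, 5, 3, 6, 4)

plot(x, y, pch = 19)

# X-spline: a negative shape makes the curve interpolate the points
xspline(x, y, shape = -0.5, border = "blue")

# Interpolating cubic spline evaluated on a fine grid of 200 x values
sm <- spline(x, y, n = 200)
lines(sm, col = "red", lty = 2)
```

The two curves differ slightly between the points; which one looks better depends on the data, so it may be worth trying a few `shape` values between -1 and 0.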

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
> project.org] On Behalf Of tooblue
> Sent: Monday, November 01, 2010 12:19 AM
> To: r-help@r-project.org
> Subject: [R] connecting points into a smooth curve
> 
> 
> If I have, say, five scatter points and want to connect them together
> into a
> smooth curve.
> I did plot(x,y,type="l"), but the graph is five segments connecting
> with
> each other, but not a smooth curve.
> I wonder if there is a line type that is a curve. Thanks!
> 
> 
> --
> View this message in context: http://r.789695.n4.nabble.com/connecting-
> points-into-a-smooth-curve-tp3021796p3021796.html
> Sent from the R help mailing list archive at Nabble.com.
> 


Re: [R] count different words in a field

2010-11-02 Thread Matevž Pavlič

Nevermind, i think summary() does this ...

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Matevž Pavlič
Sent: Tuesday, November 02, 2010 10:12 PM
To: r-help@r-project.org
Subject: [R] count different words in a field

Hi all, 

I started to ask this in the other post, but it is off topic...so here it is
again.

I have a data.frame (created with the help of this mailing list) that looks like
this:

'data.frame':   22801 obs. of  15 variables:
$ V1 : chr  "HUMUS" "SLABO" "MALO" "SLABO" ...
$ V2 : chr  "IN" "GRANULIRAN" "PREPEREL" "VEZAN" ...
$ V3 : chr  "HUMUSNA" "PEŠČEN" "MELJAST" ",KONGLOMERAT," ...
$ V4 : chr  "GLINA" "PROD" "PROD" "P0ROZEN," ...
$ V5 : chr  "Z" "DO" "DO" "S" ...
$ V6 : chr  "MALO" "r" "r" "PLASTMI" ...
$ V7 : chr  "PODA," "=" "=" "GFs," ...
$ V8 : chr  "LAHKO" "8Q" "60mm," "SIVORJAV" ...
$ V9 : chr  "GNETNA," "mm," "S" "" ...
$ V10: chr  "RJAVA" "S" "PRODNIKI," "" ...
$ V11: chr  "" "PRODNIKI" "MALO" "" ...
$ V12: chr  "" "DO" "PEŠČEN" "" ...
$ V13: chr  "" "R" "S" "" ...
$ V14: chr  "" "=" "TANKIMI" "" ...

Is it possible to count which word occurs most often in each field (V1, V2,
V3, ...), which one is second, and so on? Ideally I would like to create
a table for each field (V1, V2, V3, ...) with the prevailing word and the
number of occurrences of that word in that field (column).

Hope that explains it ok...

Thank you, m


[R] count different words in a field

2010-11-02 Thread Matevž Pavlič

Hi all, 

 

I started to ask this in the other post, but it is off topic...so here it is
again.

I have a data.frame (created with the help of this mailing list) that looks like
this:

 

'data.frame':   22801 obs. of  15 variables:
$ V1 : chr  "HUMUS" "SLABO" "MALO" "SLABO" ...
$ V2 : chr  "IN" "GRANULIRAN" "PREPEREL" "VEZAN" ...
$ V3 : chr  "HUMUSNA" "PEŠČEN" "MELJAST" ",KONGLOMERAT," ...
$ V4 : chr  "GLINA" "PROD" "PROD" "P0ROZEN," ...
$ V5 : chr  "Z" "DO" "DO" "S" ...
$ V6 : chr  "MALO" "r" "r" "PLASTMI" ...
$ V7 : chr  "PODA," "=" "=" "GFs," ...
$ V8 : chr  "LAHKO" "8Q" "60mm," "SIVORJAV" ...
$ V9 : chr  "GNETNA," "mm," "S" "" ...
$ V10: chr  "RJAVA" "S" "PRODNIKI," "" ...
$ V11: chr  "" "PRODNIKI" "MALO" "" ...
$ V12: chr  "" "DO" "PEŠČEN" "" ...
$ V13: chr  "" "R" "S" "" ...
$ V14: chr  "" "=" "TANKIMI" "" ...

 

Is it possible to count which word occurs most often in each field (V1, V2,
V3, ...), which one is second, and so on? Ideally I would like to create
a table for each field (V1, V2, V3, ...) with the prevailing word and the
number of occurrences of that word in that field (column).

 

Hope that explains it ok...

 

Thank you, m

 

 

 



Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread Rainer Hurling


On 02.11.2010 21:43 (UTC+1), David Winsemius wrote:


On Nov 2, 2010, at 4:07 PM, Rainer Hurling wrote:

snipped quite a bit of talking past each other

Of course your example with eight colours works, too. But as you can
see in the plot, the colours are in a different order than in the vector
'colors()[(2:9)*10]' itself. I expected the first box (bass2) coloured
"bisque1", the second box (bass1) "blue4" and so on.


Oh. Try putting the fill argument outside the panel and see if the panel
handles it in the manner you expect:

bp3 <- bwplot(voice.part ~ height, data = singer, main = "fill arg
outside bwplot\n1] 'bisque1' 'blue4' 'burlywood3' 'chartreuse3' 'coral3'
'cyan2' 'darkgray' 'darkorange", fill=colors()[(2:11)*10],
panel = function(...) {
panel.grid(v = -1, h = 0)
panel.bwplot( ...)
})
bp3



I hope this explanation is a bit clearer than my preceding ones.


And I hope my suggestion now "works".


Thank you for the hint, that it works also outside of the panel. It 
looks like I missed the wood for trees here ;-)


In your latest, special case the colours work. After having a nearer 
look at it I found that your colour vector has length 10 (2:11), and 
only the first eight colours are filled in the boxes.


This seems to be reproducible:

### NOT WORKING: 8 colours, not in the order of the given vector
bwplot(voice.part ~ height, data = singer,
  main = "NOT THE RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red' 
'pink' 'violet' 'brown' 'gold'",

  fill=c("yellow","blue","green","red","pink","violet","brown","gold"))

### WORKING: 10 (8+2*NA) colours in order of given vector
bwplot(voice.part ~ height, data = singer,
  main = "RIGHT ORDER OF COLOURS\n'yellow' 'blue' 'green' 'red' 'pink' 
'violet' 'brown' 'gold'",
  fill=c("yellow","blue","green","red","pink","violet","brown","gold", 
NA, NA))


I really do not understand what is going on here,
Rainer


David Winsemius, MD
West Hartford, CT



Re: [R] stacking effectively - without loops

2010-11-02 Thread David Winsemius



On Nov 2, 2010, at 4:58 PM, Dimitri Liakhovitski wrote:


Never mind - found it: expand.grid(y,x)


Yes, that is one way, and it was illustrated yesterday for a
very similar question on r-help (perhaps by Grothendieck?). Another
way is:


data.frame(lets = rep(letters[1:5], each=3), nums=rep(1:3, 5) )

There are at least two different ways that rep() can be invoked and  
each= is not the default.
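
To make the comparison concrete, here is a small sketch of both approaches side by side, using the `x` and `y` from the original question (the column names `lets` and `nums` follow the example above; the object names `g` and `r` are just for illustration). `expand.grid()` varies its first argument fastest, which is why `y` goes first:

```r
x <- letters[1:5]
y <- 1:3

# expand.grid() cycles through its first argument fastest, so y comes first
g <- expand.grid(nums = y, lets = x, stringsAsFactors = FALSE)
g <- g[, c("lets", "nums")]   # put the letter column first

# The same layout built by hand: each= repeats elementwise,
# times= repeats the whole vector
r <- data.frame(lets = rep(x, each = length(y)),
                nums = rep(y, times = length(x)),
                stringsAsFactors = FALSE)

head(g, 4)
```

Both give 15 rows running a 1, a 2, a 3, b 1, and so on.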


--
david.



On Tue, Nov 2, 2010 at 4:57 PM, Dimitri Liakhovitski
 wrote:

Hello!

I have 2 vectors:

x<-letters[1:5]
y<-1:3

Is there a way - without loops - to create a data frame such that we
repeat the whole "y" within each level of "x" so that it looks like
this:

a 1
a 2
a 3
b 1
b 2
b 3
c 1
c 2
c 3

etc?

Thank you!

--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com





--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com



David Winsemius, MD
West Hartford, CT


Re: [R] stacking effectively - without loops

2010-11-02 Thread Johannes Huesing

Dimitri Liakhovitski  [Tue, Nov 02, 2010 at 
09:57:04PM CET]:
> Hello!
> 
> I have 2 vectors:
> 
> x<-letters[1:5]
> y<-1:3
> 
> Is there a way - without loops - to create a data frame such that we
> repeat the whole "y" within each level of "x" so that it looks like
> this:
> 
> a 1
> a 2
> a 3
> b 1
> b 2
> b 3
> c 1
> c 2
> c 3
> 
> etc?

?expand.grid


-- 
Johannes Hüsing
mailto:johan...@huesing.name
http://derwisch.wikidot.com

There is something fascinating about science. One gets such wholesale
returns of conjecture from such a trifling investment of fact.
(Mark Twain, "Life on the Mississippi")


Re: [R] stacking effectively - without loops

2010-11-02 Thread Dimitri Liakhovitski

Never mind - found it: expand.grid(y,x)

On Tue, Nov 2, 2010 at 4:57 PM, Dimitri Liakhovitski
 wrote:
> Hello!
>
> I have 2 vectors:
>
> x<-letters[1:5]
> y<-1:3
>
> Is there a way - without loops - to create a data frame such that we
> repeat the whole "y" within each level of "x" so that it looks like
> this:
>
> a 1
> a 2
> a 3
> b 1
> b 2
> b 3
> c 1
> c 2
> c 3
>
> etc?
>
> Thank you!
>
> --
> Dimitri Liakhovitski
> Ninah Consulting
> www.ninah.com
>

-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com


[R] stacking effectively - without loops

2010-11-02 Thread Dimitri Liakhovitski

Hello!

I have 2 vectors:

x<-letters[1:5]
y<-1:3

Is there a way - without loops - to create a data frame such that we
repeat the whole "y" within each level of "x" so that it looks like
this:

a 1
a 2
a 3
b 1
b 2
b 3
c 1
c 2
c 3

etc?

Thank you!

-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com


[R] Hooks into dynamic help system?

2010-11-02 Thread Kevin Wright

What is the easiest way to modify the dynamic html help files?

For example, I would like to put this link on every page:
/doc/html/packages.html
Or a link to the Rseek engine with the title of the page passed as a
parameter to Rseek.
etc.

Do I re-write Rd2HTML and put it higher in the search path?

Are there hook-like functions that I can use to post-process the html code?


Are there parameters for template-type text strings that can be inserted
into the header or footer of the html page?

Any pointers would be appreciated.

-- 
Kevin Wright


Re: [R] ForestPlot or similar

2010-11-02 Thread Abhijit Dasgupta

You need to use a print statement

print(forestplot())

Lattice and ggplot2 plots need to be explicitly printed to get output into a
jpeg file when the call sits inside a function, a loop, or a sourced script.
I believe Matt's function only provides the graphics object and not the
printed version.
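
A minimal sketch of the pattern, assuming a lattice plot drawn inside a loop as in a simulation. The `singer` data and the file name are only for illustration, and `pdf()` is used here because it is available on every R build; `jpeg()` is used the same way:

```r
library(lattice)

out_file <- file.path(tempdir(), "bwplot_demo.pdf")

pdf(out_file)
for (i in 1:2) {
  p <- bwplot(voice.part ~ height, data = singer, main = paste("Run", i))
  print(p)   # without print(), nothing is drawn inside a loop or function
}
dev.off()
```

At the top level of an interactive session the object is auto-printed, which is why the same call appears to work when typed at the prompt.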

Abhijit
On 11/2/2010 4:32 PM, Mestat wrote:
> Thanks Matt,
> I am having a problem now to use this function. The function separately
> works fine. But the problem is that I am working with a simulation, so i
> placed the CREDPLOT function in my program and added the following commands
> according my data:
>
> #MY DATA, ESTIMATES, LOWER AND UPPER INTERVALS
> rw_cibas_quantile_ori_m<-rw_quantile_app_ori[-51:-1000]
> rw_cibas_low_quantile_ori_l<-rw_cibas_low_quantile_ori[-51:-1000]
> rw_cibas_up_quantile_ori_u<-rw_cibas_up_quantile_ori[-51:-1000]
>
> #GRAPHIC
> jpeg ('Nfp_rw_bas_quantile_ori.jpeg')
> forestplot(rw_cibas_quantile_ori_m,rw_cibas_low_quantile_ori_l,rw_cibas_up_quantile_ori_u,cen=403.677)
> dev.off()
>
> My program is running fine, but I am not getting any graphic. I did the
> graphic using the function FORESTPLOT, but the graphic provided by the
> function CREDPLOT is much better. Here is my code:
>
> rw_ciper_gini_ori_m<-rw_gini_app_ori[-51:-1000]
> rw_ciper_low_gini_ori_l<-rw_ciper_low_gini_ori[-51:-1000]
> rw_ciper_up_gini_ori_u<-rw_ciper_up_gini_ori[-51:-1000]
> tabletext<-cbind(c(rep(" ",50),NA))
> rw_ciper_gini_ori_m<-c(rw_ciper_gini_ori_m,NA)
> rw_ciper_low_gini_ori_l<-c(rw_ciper_low_gini_ori_l,NA)
> rw_ciper_up_gini_ori_u<-c(rw_ciper_up_gini_ori_u,NA)
> jpeg ('Sfp_rw_per_gini_ori.jpeg')
> forestplot(tabletext,rw_ciper_gini_ori_m,rw_ciper_low_gini_ori_l,rw_ciper_up_gini_ori_u,zero=0.4,col=meta.colors(box="royalblue",line="darkblue"))
> dev.off()
>
> Any information about what is missing/wrong in order to obtain the graphic
> with the function CREDPLOT is welcome.
> Thanks in advance,
> Marcio
>


Re: [R] splitting First 10 words in a string

2010-11-02 Thread Matevž Pavlič

Hi Steven, 

 

Thank you for the help. I get an error though when I do this:

> lit <- read.csv("litologija.csv", sep=";", dec=".")
> sent <- data.frame(sentence=lit$Opis, stringsAsFactors=FALSE)
> str(sent)
> sentV <- rep(sent,10)
> str(sentV)
> first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=10)
> DF <- data.frame(Sentence=sent,first,second,third,fourth,fifth,sixth,seventh,eighth,ninth,tenth,stringsAsFactors=FALSE)

»Error in data.frame(Sentence = sent, first, second, third, fourth, fifth,  :
  arguments imply differing number of rows: 22928, 10«

What am I doing wrong?

Thanks, m

 

 

 

From: steven mosher [mailto:mosherste...@gmail.com] 
Sent: Tuesday, November 02, 2010 8:45 PM
To: David Winsemius
Cc: Matevž Pavlič; Gaj Vidmar; r-h...@stat.math.ethz.ch
Subject: Re: [R] spliting first 10 words in a string

 

 Thanks david.

 

  Matevz, maybe I can help explain by doing a very simple and brute force 
approach

as opposed to  the way david did it. But you should learn his methods.

 

I will just do a subset of your problem and if you understand how it works then 
you should

be able to get something done and then make it more elegant.

 

First, I simplify the problem by separating out the "sentence" column.

 

You can do this with your data frame by simply doing this

 

MySentence <-data.frame(sentence=yourbigDF$Opis,stringsAsFactors=FALSE)

 

so I take your original data.frame (yourbigDF) and I just create a copy of that 
one column

 $Opis

 

Later we can merge the two back together after I add 10 columns for the words

 

 

Lets make some dummy data with just 10 rows

 

 

 

 sentence<- "this is a sentence with ten words or maybe more than ten words"

 sentV<-rep(sentence,10)

# now I just made 10 rows of the same sentence
# NEXT, because I am going to create 10 new columns of 10 rows, I create
# 10 vectors; each is named and each has 10 elements, one per row.
# they have NO DATA in them

 

 
first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=10)

 

# Next I create a data frame with Sentence in the first column and 10 blank columns.
# NOTE I use stringsAsFactors=FALSE

DF <- data.frame(Sentence=sentence,first,second,third,fourth,fifth,sixth,seventh,eighth,ninth,tenth,stringsAsFactors=FALSE)

 

# This is what it would look like ( the first row)

DF[1,]

 

Sentence first second third fourth fifth sixth seventh eighth ninth tenth

1 this is a sentence with ten words or maybe more than ten words FALSE  FALSE 
FALSE  FALSE FALSE FALSE   FALSE  FALSE FALSE FALSE

 

Next, I will show you how to assign the first ten words to the 10 blank columns

 

DF[1,2:11]<-strsplit(DF[1,1]," ")[[1]][1:10]

 

#DF[1,2:11]  selects the columns 2-11 of the first row
#strsplit  returns the first 10 words [1:10] and places them in columns 2-11

 

If you want to do this the slow way you can just loop through your dataframe 
row by row

or you can probably use apply.

 

Make more sense?

> DF[1,2:11]<-strsplit(DF[1,1]," ")[[1]][1:10]
> DF[1,]
                                                        Sentence first second third   fourth fifth sixth seventh eighth ninth tenth
1 this is a sentence with ten words or maybe more than ten words  this     is     a sentence  with   ten   words     or maybe  more
> DF[1,"first"]
[1] "this"
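
A sketch of the vectorised route for every row at once, assuming a hypothetical character vector `sentences` in place of `yourbigDF$Opis` (the helper `first_words` and the column names `word1` to `word10` are made up for this example). Building the word columns from the data itself also keeps them the same length as the sentence column, which is what `data.frame()` requires:

```r
# Hypothetical stand-in for yourbigDF$Opis
sentences <- c("this is a sentence with ten words or maybe more than ten words",
               "a short one",
               "")

# Return exactly n words per sentence, padding short ones with NA
first_words <- function(s, n = 10) {
  w <- strsplit(s, " +")[[1]]
  w <- w[w != ""]
  length(w) <- n   # truncates long vectors, pads short ones with NA
  w
}

# One row per sentence, one column per word
words <- t(sapply(sentences, first_words, USE.NAMES = FALSE))
colnames(words) <- paste0("word", 1:10)

DF <- data.frame(Sentence = sentences, words, stringsAsFactors = FALSE)
DF[1, "word1"]
```

Sentences with fewer than ten words get `NA` in the remaining columns rather than an error.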

 

On Tue, Nov 2, 2010 at 12:22 PM, David Winsemius  wrote:


On Nov 2, 2010, at 3:01 PM, Matevž Pavlič wrote:

Hi all,

Thanks for all the help. I managed to do it with what Gaj suggested (Excel :().

The last solution from David is also great; I just don't understand why R put
the words in 14 columns and three rows?

 

Because the maximum number of words was 14 and the fill argument was TRUE. 
There were three rows because there were three items in the supplied character 
vector.

 

I would like it to put just the first 10 words in source field to 10
different destination fields, but the same row. And so on...is that possible?

 

I don't know what a destination field might be. Those are not R data types.

This would trim the extra columns (in this example set to those greater than 8) 
by adding a lot of "NULL"'s to the end of a colClasses specification  at 
the expense of a warning message which can be ignored:

> read.table(textConnection(words), fill=T, colClasses = c(rep("character", 8), 
> rep("NULL", 30) ) , stringsAsFactors=FALSE )


   V1    V2    V3      V4    V5    V6    V7      V8
1   I  have     a columnn  with  text  that     has
2   I would  like      to split these words      in
3 but  just first     ten words    in   the string.

Warning message:
In read.table(textConnection(words), fill = T, colClasses = c(rep("character",  :
 cols = 14 != length(data) = 38


If you want to assign the first column to a variable then just:
> first8 <- read.table(textConnection(words), fill=T, colClasses = 
> c(rep("character", 8), rep("NULL", 30) ) , stringsAsFactors=F

[R] Display of NAs in character columns of a data frame under fix() or edit().

2010-11-02 Thread Rolf Turner


Example:

xxx <- data.frame(x=1:26,y=letters)
xxx$x[c(2,4,6,8)] <- NA
xxx$y[c(1,3,5,7)] <- NA

yyy <- edit(xxx)

The missing values in xxx$y appear as blanks in the spreadsheet window that
appears, whereas the missing values in the numeric column "x" appear as "NA"
(as I would expect).

Is this a bug or a feature?

cheers,

Rolf Turner

P.S.

> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_NZ.UTF-8/en_NZ.UTF-8/C/C/en_NZ.UTF-8/en_NZ.UTF-8

attached base packages:
[1] datasets  utils     stats     graphics  grDevices methods   base

other attached packages:
[1] misc_0.0-13     gtools_2.6.2    spatstat_1.20-5 deldir_0.0-12
[5] mgcv_1.6-2      fortunes_1.4-0  MASS_7.3-8

loaded via a namespace (and not attached):
[1] grid_2.12.0        lattice_0.19-13    Matrix_0.999375-44 nlme_3.1-97
[5] tools_2.12.0


Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread David Winsemius



On Nov 2, 2010, at 4:07 PM, Rainer Hurling wrote:

snipped quite a bit of talking past each other

Of course your example with eight colours works, too. But as you can
see in the plot, the colours are in a different order than in the vector
'colors()[(2:9)*10]' itself. I expected the first box (bass2)
coloured "bisque1", the second box (bass1) "blue4" and so on.


Oh. Try putting the fill argument outside the panel and see if the  
panel handles it in the manner you expect:


bp3 <- bwplot(voice.part ~ height, data = singer, main = "fill arg  
outside bwplot\n1] 'bisque1' 'blue4' 'burlywood3' 'chartreuse3'  
'coral3' 'cyan2' 'darkgray' 'darkorange", fill=colors()[(2:11)*10],

  panel = function(...) {
panel.grid(v = -1, h = 0)
panel.bwplot( ...)
   })
 bp3



I hope this explanation is a bit clearer than my preceding ones.


And I hope my suggestion now "works".





Thanks in advance,
Rainer Hurling


David Winsemius, MD
West Hartford, CT


Re: [R] Multiple imputation for nominal data

2010-11-02 Thread John Sorkin

Thank you!
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

>>> Andrew Miles  11/2/2010 3:59 PM >>>
There are a couple of packages that do MI, including MI for nominal  
data.  The most recent of these is "mi", but I believe "mice" might do  
it as well.  Both are available on the CRAN, and both have useful  
articles that teach you how to use them.  The citations for these  
articles can be found at the bottom of the help page that appears by  
typing

?mi
OR for mice
?mice

mi is the newer package and has some useful control features, but because
it is newer it is still under development.

Andrew Miles

On Nov 2, 2010, at 3:38 PM, John Sorkin wrote:

> I am looking for an R function that will run multiple imputation  
> (perhaps fully conditional imputation, MICE, or sequential  
> generalized regression) for non-MVN data, specifically nominal data.  
> My dependent variable is dichotomous, all my predictors are nominal.  
> I have a total of 4,500 subjects, 1/2 of whom are missing the main  
> independent variables. I would appreciate any suggestions that the  
> users of the listserver might have.
> John
>
>
> John David Sorkin M.D., Ph.D.
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>
> Confidentiality Statement:
> This email message, including any attachments, is for ...{{dropped:18}}


Re: [R] ForestPlot or similar

2010-11-02 Thread Mestat


Thanks Matt,
I am now having a problem using this function. On its own the function
works fine. But I am working with a simulation, so I placed the CREDPLOT
function in my program and added the following commands according to my
data:

#MY DATA, ESTIMATES, LOWER AND UPPER INTERVALS
rw_cibas_quantile_ori_m<-rw_quantile_app_ori[-51:-1000]
rw_cibas_low_quantile_ori_l<-rw_cibas_low_quantile_ori[-51:-1000]
rw_cibas_up_quantile_ori_u<-rw_cibas_up_quantile_ori[-51:-1000]

#GRAPHIC
jpeg ('Nfp_rw_bas_quantile_ori.jpeg')
forestplot(rw_cibas_quantile_ori_m,rw_cibas_low_quantile_ori_l,rw_cibas_up_quantile_ori_u,cen=403.677)
dev.off()

My program is running fine, but I am not getting any graphic. I did the
graphic using the function FORESTPLOT, but the graphic provided by the
function CREDPLOT is much better. Here is my code:

rw_ciper_gini_ori_m<-rw_gini_app_ori[-51:-1000]
rw_ciper_low_gini_ori_l<-rw_ciper_low_gini_ori[-51:-1000]
rw_ciper_up_gini_ori_u<-rw_ciper_up_gini_ori[-51:-1000]
tabletext<-cbind(c(rep(" ",50),NA))
rw_ciper_gini_ori_m<-c(rw_ciper_gini_ori_m,NA)
rw_ciper_low_gini_ori_l<-c(rw_ciper_low_gini_ori_l,NA)
rw_ciper_up_gini_ori_u<-c(rw_ciper_up_gini_ori_u,NA)
jpeg ('Sfp_rw_per_gini_ori.jpeg')
forestplot(tabletext,rw_ciper_gini_ori_m,rw_ciper_low_gini_ori_l,rw_ciper_up_gini_ori_u,zero=0.4,col=meta.colors(box="royalblue",line="darkblue"))
dev.off()

Any information about what is missing or wrong that prevents me from getting
the graphic with the function CREDPLOT is welcome.
Thanks in advance,
Marcio
-- 
View this message in context: 
http://r.789695.n4.nabble.com/ForestPlot-or-similar-tp3020374p3024354.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] density() function: differences with S-PLUS

2010-11-02 Thread William Dunlap

> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org] On Behalf Of Nicola 
> Sturaro Sommacal (Quantide srl)
> Sent: Tuesday, November 02, 2010 3:05 AM
> To: r-help@r-project.org
> Subject: [R] density() function: differences with S-PLUS
> 
> Hello!
> 
> Does anyone know what the differences are between R and S-PLUS in
> the density() function?
> 
> For example, I would like to replicate this simple S-PLUS code in
> R, but I don't understand which parameter I should modify to get
> the same results.
> 
> S-PLUS CODE:
> density(1:1000, width = 4)
> 
> R-CODE:
> density(1:1000, bw = 4, window = "g",  n = 50, cut = 0.75)
> 
> I obtain the same x values, but different y values. I also tried
> different examples, with different parameters.

I needed to use the to= and from= arguments to get the same
set of x values in R and S+.  E.g.,
  z <- density(x=0, width=3, window="gaussian",
 n=2001, from=-10, to=10, cut=0.75)
gave identical x outputs in R and S+.  By using x=0
you can see the difference in the gaussian-based kernel
used by R and Splus:
  plot(z$x, z$y, pch=".", log="y")
Splus, as its help("density") states, uses a truncated
Gaussian kernel:
  "The "gaussian" window is truncated at 4
  standard deviations (and then scaled
  appropriately to adjust for the truncated
  area)."
R appears to not truncate the Gaussian kernel. 
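
To make the kernel difference concrete, here is a minimal sketch in R. It relies on R's documented handling of a numeric width (for the Gaussian kernel, bw = width/4). The trunc_gauss() helper is hypothetical: it is written only from the S-PLUS help text quoted above and is not S-PLUS's actual code.

```r
# R side: one observation at 0, so the estimate is just the kernel itself.
z <- density(x = 0, width = 3, window = "gaussian",
             n = 2001, from = -10, to = 10, cut = 0.75)
z$bw                                  # width/4 for the Gaussian kernel

# R's estimate follows an untruncated normal density with sd = bw
# (up to the small error of density()'s binned FFT approximation).
max(abs(z$y - dnorm(z$x, sd = z$bw)))

# Hypothetical helper: a Gaussian kernel cut off at k standard deviations
# and rescaled to integrate to 1, as the S-PLUS help text describes.
trunc_gauss <- function(x, sd, k = 4) {
  inside <- abs(x) <= k * sd
  ifelse(inside, dnorm(x, sd = sd) / (pnorm(k) - pnorm(-k)), 0)
}

# On a log scale the truncated kernel vanishes beyond 4 * bw, while the
# untruncated one keeps decaying smoothly.
plot(z$x, dnorm(z$x, sd = z$bw), type = "l", log = "y",
     xlab = "x", ylab = "kernel density")
lines(z$x, trunc_gauss(z$x, sd = z$bw), lty = 2)
```

If S-PLUS follows the same width convention, width = 4 there would correspond to bw = 1 in R rather than bw = 4, which may account for part of the difference in the y values; that is an assumption worth checking against the S-PLUS documentation.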

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com  

> Can you help me?
> 
> Thank you in advance.
> 
> Nicola Sturaro
> 

Re: [R] coxph linear.predictors

2010-11-02 Thread Bond, Stephen

Re: 1. X*beta  != linear.predictor.  

The equality is stated in three different help docs, which is misleading,
especially in light of the way glm is set up. I felt like I was wrestling with
SAS :-)
The relative risk was the original idea behind Cox regression, but it can be
used for many non-relative purposes. If we want to calculate death probability
in each period, then lp is no longer shift invariant.
 
Re: 2. Survfit is too slow.
It seems that the implementation follows the procedure in the original Cox 
paper, which calls iterative optimization for each death time.
My subjects are mortgages and both the estimation and the prediction samples
contain several hundred thousand records. The call appears to recalculate/optimize
everything even though only the $surv changes. Since each subject belongs to a
single stratum, most of the calculations are redundant.
I am not much of a programmer and could never figure out how to use the R
profiler, so I cannot be exact here, but the simple exponentiation takes no time
and survfit takes several seconds for each subject.
So I did:

survlong <- survfit(modlong) # a single call suffices
bl1 <- c(1, cumsum(survlong$strata) + 1) # start index of each stratum
bl2 <- cumsum(survlong$strata)           # end index of each stratum
for (jj in 1:nrow(newapp)) {
  strat <- as.integer(newapp[jj, "termfac"])
  surv <- survlong$surv[bl1[strat]:bl2[strat]] # extract the stratum
  risk <- predict(modlong, newdata = newapp[jj, ], type = "risk") # it seems there is no
  # optimization here
  newsurv <- surv^risk # we are done
  # ... rest of code
}

As a package maintainer, you have to decide whether including any of the above 
and below is useful or users can figure out things on their own. Or maybe 
survfit can be made smart and subsequent calls on the same model will use the 
first call to survfit?? It's your call :-)

Kind regards

Stephen B
-Original Message-
From: Terry Therneau [mailto:thern...@mayo.edu] 
Sent: Thursday, October 28, 2010 6:39 PM
To: Bond, Stephen; David Winsemius
Cc: r-help@r-project.org
Subject: Re: [R] coxph linear.predictors

Gentlemen,
  I read R-news in batch mode so I'm often a day behind.  Let me try to
answer some of the questions.

 1. X*beta  != linear.predictor.  
I'm sorry if the documentation isn't all it could be.  Between the book,
tech report, and help I've written about 400 pages, but this particular
topic isn't yet in it.  The final snipe about being "opaque like SAS"
was really unfair.
The Cox model is a relative risk model, if lp is a linear predictor then
so is lp +c for any constant; they are equally good and equally valid.
The linear.predictor component in a coxph fit is (X-means) * beta.  The
computation exp(lp) occurs multiple times downstream and this keeps the
exp function from overflowing when there is something like a Date object
as a predictor.  Adding this constant changes not a single downstream
calculation.
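
The centering described above can be checked with a short sketch. It assumes the survival package and its bundled lung data, and that the installed version centers the linear predictor at the values stored in fit$means; the model itself is only an illustration.

```r
library(survival)

# Small illustration on the lung data shipped with the survival package.
fit <- coxph(Surv(time, status) ~ age, data = lung)

X <- model.matrix(fit)                                      # covariates, no intercept
lp_raw      <- drop(X %*% coef(fit))                        # X * beta
lp_centered <- drop(sweep(X, 2, fit$means) %*% coef(fit))   # (X - means) * beta

# The stored linear predictor should match the centered version ...
all.equal(unname(fit$linear.predictors), unname(lp_centered))

# ... and differ from X * beta only by a constant, so the risk ratio
# between any two subjects is unchanged.
range(lp_raw - lp_centered)
```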

2. Survfit is too slow.
 I'd like to hear more about this.  My work mostly involves modest data
sets so perhaps I haven't seen it.  Accuracy and maintainability have
been my first worries.

3. Baseline survival.
 Let xbase be a particular set of values for the x covariates (one for
each).  The survival curve for a given xbase is obtained from survfit
   fit <- coxph(
   sfit <- survfit(fit, newdata=xbase)
   chaz <- -log(sfit$surv)  #cumulative hazard
(The xbase vector will need to have variable names for the function to
know which value goes to which of course).

The cumulative hazard for any other subject will be 
   newhaz <- chaz * exp(fit$coef%*% (x-xbase))
There is not a simple transformation of the standard error from one fit
to another, however.  You will need to call survfit with a data frame
for newdata, which will return one curve per row with the proper values.
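
As a rough check of the rescaling formula above, the following sketch compares the hand-computed curve with the one survfit returns directly. The model, the lung data and the two ages are arbitrary choices for illustration, and the comparison assumes survfit's default estimator for coxph fits (survival as exp(-cumulative hazard)).

```r
library(survival)

fit   <- coxph(Surv(time, status) ~ age, data = lung)
xbase <- data.frame(age = 60)      # reference covariate values (arbitrary)
xnew  <- data.frame(age = 70)      # another subject (arbitrary)

sfit <- survfit(fit, newdata = xbase)
chaz <- -log(sfit$surv)            # cumulative hazard at xbase

# Rescale by the relative risk of xnew versus xbase.
newhaz  <- chaz * exp(coef(fit)[["age"]] * (xnew$age - xbase$age))
newsurv <- exp(-newhaz)

# Cross-check against asking survfit() for the new subject directly.
direct <- survfit(fit, newdata = xnew)
all.equal(as.vector(newsurv), as.vector(direct$surv))
```

This is the same relationship as raising a reference survival curve to the power of the relative risk, which only gives the point estimate; as noted above, the standard errors do not transform this simply.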

In my view there is no such thing as "A" baseline survival curve.  Any
xbase you choose is a baseline.  However, it is wise to choose something
near the center of the data space in order to avoid numeric problems
with the exp function above.  I would never ever choose a vector of
zeros, although some textbooks do -- it saves them about 8 characters
of typing in the newhaz formula above.

Terry Therneau


Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread Rainer Hurling

On 02.11.2010 20:08 (UTC+1), David Winsemius wrote:

On Nov 2, 2010, at 2:32 PM, Rainer Hurling wrote:

On 02.11.2010 19:08 (UTC+1), David Winsemius wrote:

On Nov 2, 2010, at 1:19 PM, Rainer Hurling wrote:

Inspired by colouring the dots of box-whisker plots I am trying to
also fill the boxes (rectangles) with different colours. This seems
not to work as I expected.

Looking at the help page of panel.bwplot it says: 'fill - color to
fill the boxplot'. Obviously it is only intended to fill all boxes
with only one colour?

Nevertheless the following example shows, that 'fill' from
panel.bwplot is able to work with more than one colour. But this only
works with one colour or multiples of 5 colours:

-
bp1 <- bwplot(voice.part ~ height, data = singer, main = "1 color works",
              panel = function(...) {
                  panel.bwplot(col = c("yellow"),
                               fill = c("yellow"), ...)
              })

bp2 <- bwplot(voice.part ~ height, data = singer,
              main = "3 colors do NOT work",
              panel = function(...) {
                  panel.bwplot(col = c("yellow", "blue", "green"),
                               fill = c("yellow", "blue", "green"), ...)
              })

bp3 <- bwplot(voice.part ~ height, data = singer,
              main = "5 colors do work",
              panel = function(...) {
                  panel.bwplot(col = c("yellow", "blue", "green", "pink", "red"),
                               fill = c("yellow", "blue", "green", "pink", "red"), ...)
              })

plot(bp1, split = c(1, 1, 1, 3))
plot(bp2, split = c(1, 2, 1, 3), newpage = FALSE)
plot(bp3, split = c(1, 3, 1, 3), newpage = FALSE)
-

Is there any chance to use more than one filling colour correctly?

Thanks for answering.

You have eight boxes to fill and 8 dots to color. You can either supply
8 distinct colors or you can supply some lesser number and they will be
recycled across the entire 8 boxes and dots. What you cannot do ( and
expect to see the dots against the fill background) is plot the dots as
the same colors as the fill.

It was not my intention to get the dots coloured in the same colour as
the boxes. Instead I am looking for a method to fill the boxes with a
predefined set of different colours (from a colour vector). As far as I
can see this is only possible for one colour and multiples of five
colours.

I think first I have to apologise for my bad English. Sorry for any
misunderstanding.

Huh? My example used 4 colors. It should have worked with eight colors
as well. There are eight groups and

Yes, all is ok with your example. My only problem is that these four
colours are not ordered as given by the vector (see below).

The dots should remain uncoloured ...

Then leave out the col= argument (assuming uncolored means black.)

I used these coloured dots to explain, that ordered colours (from given 
vector) work with dots, but not with the boxes.

This will let you see all colors of dots and fill with only 4 colors
because I set it up so there were no two identical colors in the sequence
of dots and fill during the recycling:

bp4 <- bwplot(voice.part ~ height, data = singer,
              main = "5 colors do work",
              panel = function(...) {
                  panel.bwplot(col = rev(c("yellow", "blue", "green", "pink")),
                               fill = c("yellow", "blue", "green", "pink"), ...)
              })

In your example you can see that the dots colors are painted in the
right (reversed) order, the boxes are painted as sequence
c("yellow","pink","green","blue") instead of
c("yellow","blue","green","pink").

I do not understand how to pass a given order and a given number of
colours on to the boxes.

See if this example using selected colors() works to make it clearer:

 > colors()[(2:9)*10]
[1] "bisque1" "blue4" "burlywood3" "chartreuse3" "coral3"
[6] "cyan2" "darkgray" "darkorange"

bp5 <- bwplot(voice.part ~ height, data = singer,
              main = "5 colors do work",
              panel = function(...) {
                  panel.grid(v = -1, h = 0)
                  panel.bwplot(fill = colors()[(2:9)*10], ...)
              })

bp5

(Needed to avoid the first colors() because they were mostly variants of
"white".)
 > colors()[1:8]
[1] "white" "aliceblue" "antiquewhite" "antiquewhite1"
[5] "antiquewhite2" "antiquewhite3" "antiquewhite4" "aquamarine"

Of course your example with eight colours works, too. But as you can see
in the plot, the colours are in a different order than in the vector
'colors()[(2:9)*10]' itself. I expected the first box (bass2) to be coloured
"bisque1", the second box (bass1) "blue4" and so on.

I hope this explanation is a bit clearer than my preceding ones.

Thanks in advance,
Rainer Hurling


Re: [R] Multiple imputation for nominal data

2010-11-02 Thread Andrew Miles

There are a couple of packages that do MI, including MI for nominal  
data.  The most recent of these is "mi", but I believe "mice" might do  
it as well.  Both are available on the CRAN, and both have useful  
articles that teach you how to use them.  The citations for these  
articles can be found at the bottom of the help page that appears by  
typing


?mi
OR for mice
?mice

mi is the newer package and has some useful control features, but because
it is newer it is still under development.


Andrew Miles


On Nov 2, 2010, at 3:38 PM, John Sorkin wrote:

I am looking for an R function that will run multiple imputation  
(perhaps fully conditional imputation, MICE, or sequential  
generalized regression) for non-MVN data, specifically nominal data.  
My dependent variable is dichotomous, all my predictors are nominal.  
I have a total of 4,500 subjects, 1/2 of whom are missing the main  
independent variables. I would appreciate any suggestions that the  
users of the listserver might have.

John


John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped: 
6}}



[R] multi-level cox ph with time-dependent covariates

2010-11-02 Thread Mattia Prosperi

Dear all,

I would like to know if it is possible to fit in R a Cox ph model with
time-dependent covariates and to account for hierarchical effects at
the same time. Additionally, I'd like also to know if it would be
possible to perform any feature selection on this model fit.

I have a data set that is composed of multiple marker measurements
(and hundreds of covariates) at different time points from different
tissue samples of different patients. Suppose that the data came
from an animal model with very few subjects (n=6) that were
followed up after a pathogen exposure and measured several times,
sampling different tissues on the same days, until a certain outcome
was reached (or the outcome was censored). Suppose that the pathogen can vary
over time (it might be a bacterium that selects for drug resistance) and
that it can also vary across different tissue reservoirs within the
same patient.

In other words: names(data) = patient_id, start_time, stop_time,
tissue_id, pathogen_type, marker1, ..., marker100, ..., outcome

If I had multiple observations per patient at different time
intervals, I would model it like this (hope it is correct)

model<-coxph(Surv(start_time,stop_time,outcome)~all_covariates+cluster(patient_id))

But now I have both the patient and the tissue, and hundreds of
different variables. I thought I could use the coxme library, since it
has also a ridge regression feature. Shall I then model nested random
effects by considering both the patient_id and the tissue_id?

Like
model<-coxme(Surv(start_time,stop_time,outcome) ~ covariates + (1 | patient_id/tissue_id))

Then, how could I shrink the coefficients in order to select a subset
of them with non-negligible effects? Might I also consider running an
AIC-based forward-backward selection?

Thanks, and apologies if I am completely off track,

M.P.


Re: [R] spliting first 10 words in a string

2010-11-02 Thread Matevž Pavlič

Hi, 

OK, I have got this now. At least I think so. I got a data.frame with 15 fields; all
other words have been truncated, which is what I want. But I have that in a
separate data.frame from the one it was in before (it would be nice if it were
in the same one ...)

'data.frame':   22801 obs. of  15 variables:
 $ V1 : chr  "HUMUS" "SLABO" "MALO" "SLABO" ...
 $ V2 : chr  "IN" "GRANULIRAN" "PREPEREL" "VEZAN" ...
 $ V3 : chr  "HUMUSNA" "PEŠČEN" "MELJAST" ",KONGLOMERAT," ...
 $ V4 : chr  "GLINA" "PROD" "PROD" "P0ROZEN," ...
 $ V5 : chr  "Z" "DO" "DO" "S" ...
 $ V6 : chr  "MALO" "r" "r" "PLASTMI" ...
 $ V7 : chr  "PODA," "=" "=" "GFs," ...
 $ V8 : chr  "LAHKO" "8Q" "60mm," "SIVORJAV" ...
 $ V9 : chr  "GNETNA," "mm," "S" "" ...
 $ V10: chr  "RJAVA" "S" "PRODNIKI," "" ...
 $ V11: chr  "" "PRODNIKI" "MALO" "" ...
 $ V12: chr  "" "DO" "PEŠČEN" "" ...
 $ V13: chr  "" "R" "S" "" ...
 $ V14: chr  "" "=" "TANKIMI" "" ...

Now, I have another problem. Is it possible to count which word occurs most
often in each field (V1, V2, V3, ...), which one is second, and so on?
Ideally I would create a table for each field (V1, V2, V3, ...) with the word and
the number of occurrences in that field (column).
I suppose it could be done in SQL, but since I saw what R can do I guess
this can be done here too?
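
One way the per-column word counts asked about above might be done in base R is sketched below. The words_df data frame is a small made-up stand-in for the real 15-column one, and treating "" as "no word" is an assumption based on the str() output shown.

```r
# Toy stand-in for the data frame of words ("" marks "no word").
words_df <- data.frame(
  V1 = c("HUMUS", "SLABO", "MALO", "SLABO"),
  V2 = c("IN", "GRANULIRAN", "PREPEREL", "VEZAN"),
  V3 = c("", "PROD", "PROD", ""),
  stringsAsFactors = FALSE
)

# One frequency table per column, most frequent word first,
# with empty strings dropped before counting.
word_counts <- lapply(words_df, function(col) {
  sort(table(col[col != ""]), decreasing = TRUE)
})

word_counts$V1                    # counts for the first field
as.data.frame(word_counts$V1)     # the same counts as word/frequency rows
```

The result is a list with one table per field, so word_counts$V2, word_counts$V3 and so on give the ranking for the other columns.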

Thanks, m

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net] 
Sent: Tuesday, November 02, 2010 8:23 PM
To: Matevž Pavlič
Cc: Gaj Vidmar; r-h...@stat.math.ethz.ch
Subject: Re: [R] spliting first 10 words in a string


On Nov 2, 2010, at 3:01 PM, Matevž Pavlič wrote:

> Hi all,
>
> Thanks for all the help. I managed to do it with what Gaj suggested 
> (Excel :().
>
> The last solution from David is also great, I just don't understand why
> R put the words in 14 columns and three rows?

Because the maximum number of words was 14 and the fill argument was TRUE. 
There were three rows because there were three items in the supplied character 
vector.

> I would like it to put just the first 10 words in the source field into 10
> different destination fields, but in the same row. And so on...is that
> possible?

I don't know what a destination field might be. Those are not R data types.

This would trim the extra columns (in this example set to those greater than 8) 
by adding a lot of "NULL"'s to the end of a colClasses specification  at 
the expense of a warning message which can be
ignored:

 > read.table(textConnection(words), fill=T, colClasses = c(rep("character", 8), rep("NULL", 30) ) , stringsAsFactors=FALSE )
   V1    V2    V3      V4    V5    V6    V7      V8
1   I  have     a columnn  with  text  that     has
2   I would  like      to split these words      in
3 but  just first     ten words    in   the string.
Warning message:
In read.table(textConnection(words), fill = T, colClasses = c(rep("character",  :
   cols = 14 != length(data) = 38


If you want to assign the first column to a variable then just:
 > first8 <- read.table(textConnection(words), fill=T, colClasses = c(rep("character", 8), rep("NULL", 30) ) , stringsAsFactors=FALSE)
 > var1 <- first8[[1]]
 > var1
[1] "I"   "I"   "but"
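
For anyone wanting to reproduce this, here is a self-contained sketch. The words vector is reconstructed from the output shown in this thread, and instead of the colClasses/"NULL" trick it reads every field as text and then keeps the first ten columns, which avoids the warning. The text= argument of read.table is assumed to be available (it is in current R versions).

```r
# Example sentences reconstructed from the output shown in this thread.
words <- c(
  "I have a columnn with text that has quite a few words in it.",
  "I would like to split these words in separate columns",
  "but just first ten words in the string. Is that possible in R?"
)

# fill = TRUE pads short rows; colClasses = "character" keeps every field as text;
# quote and comment.char are switched off so apostrophes or '#' cannot break a row.
all_words <- read.table(text = words, fill = TRUE, colClasses = "character",
                        quote = "", comment.char = "")
dim(all_words)                  # one row per sentence, one column per word slot

first10 <- all_words[, 1:10]    # drop columns 11 and beyond
first10
```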

--
David.

>
> Thank you, m
> -Original Message-
> From: r-help-boun...@r-project.org 
> [mailto:r-help-boun...@r-project.org
> ] On Behalf Of David Winsemius
> Sent: Tuesday, November 02, 2010 3:47 PM
> To: Gaj Vidmar
> Cc: r-h...@stat.math.ethz.ch
> Subject: Re: [R] spliting first 10 words in a string
>
>
> On Nov 2, 2010, at 6:24 AM, Gaj Vidmar wrote:
>
>> Though  in this list, in Excel it's just (literally!) five 
>> clicks away!
>> (with the column in question selected) Data -> Text to Columns -> 
>> Delimited -> tick Space -> Finish Pa je! (~Voila in Slovenian) (then 
>> import back to R, keeping only the first 10 columns if so
>> desired)
>
> You could do the same thing without needing to leave R. Just 
> read.table( textConnection(..), header=FALSE, fill=TRUE)
>
>> read.table(textConnection(words), fill=T)
>    V1    V2    V3      V4    V5    V6    V7      V8       V9     V10      V11   V12 V13 V14
> 1   I  have     a columnn  with  text  that     has    quite       a      few words  in it.
> 2   I would  like      to split these words      in separate columns
> 3 but  just first     ten words    in   the string.       Is    that possible    in  R?
>
>>
>> Regards,
>> Assist. Prof. Gaj Vidmar, PhD
>> University Rehabilitattion Institute, Republic of Slovenia
>>
>> Irrelevant P.S. Long ago, before embarking on what eventually ended 
>> mainly in statistics, I did two years of geology, so (and also 
>> because of knowing what the poster's institute does) I even kinda 
>> imagine what these data are.
>>
>> "Matevž Pavlič"  wrote in message
>> news:ad5ca6183570b54f92aa45ce2619f9b9d96...@gi-zrmk.si...
>>> Hi,
>>>
>>> I am sorry, will try to be more exact from now on...
>>>
>>> I have a data.frame with a field called Opis. It contains sentences
>>> that I would like to split in words or

Re: [R] visualize TukeyHSD results

2010-11-02 Thread Mendiburu, Felipe (CIP)

Dear Timothy,

Use library(agricolae)
> library(agricolae)
> a = aov(Weight~Feed)
> HSD.test(a,"Feed")

HSD.test(a,"Feed", group=TRUE)
HSD.test(a,"Feed", group=FALSE)
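
For readers without agricolae installed, the ingredients of such a grouped display can be inspected in base R. This is only a sketch: it uses the built-in chickwts data as a stand-in for the Weight ~ Feed example, and it lists the pairs that are not significantly different rather than computing the letters themselves.

```r
# Built-in data: chick weights under six feed types (stand-in for Weight ~ Feed).
a  <- aov(weight ~ feed, data = chickwts)
tk <- TukeyHSD(a)$feed            # matrix with columns diff, lwr, upr, p adj

# Group means, largest first, as in the desired summary table.
sort(tapply(chickwts$weight, chickwts$feed, mean), decreasing = TRUE)

# Pairs that are NOT significantly different at the 5% level; feeds
# joined by such pairs would share a letter in a grouped display.
not_different <- rownames(tk)[tk[, "p adj"] >= 0.05]
not_different
```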

Regards,

Felipe de Mendiburu.
http://tarwi.lamolina.edu.pe/~fmendiburu
International Potato Center. www.cipotato.org
University: Agraria La Molina - Peru. www.lamolina.edu.pe

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Timothy Spier
Sent: Thursday, October 21, 2010 9:50 PM
To: r-help@r-project.org
Subject: [R] visualize TukeyHSD results


I am a new R user but a long time SAS user. I searched for a response to
this question but no luck, so forgive me if this topic has been covered
before. I am running a TukeyHSD post hoc test after running an ANOVA. I
get the results of all pairwise comparisons, no problem. However, the
output table is a little "busy", and I'd like to make the output easier
to read. Specifically, I would like all groups which are not
significantly different to be given the same letter. 

For example, here is a simple ANOVA with Tukey post hoc. It compares
weight gain in pigs among 4 feeds labeled "A", "B", "C", and "D":

> a = aov(Weight~Feed)
> TukeyHSD(a)
  Tukey multiple comparisons of means
95% family-wise confidence level

Fit: aov(formula = Weight ~ Feed)

$Feed
      diff        lwr       upr     p adj
B-A   6.68   1.096263 12.263737 0.0168421
C-A   8.73   2.807553 14.652447 0.0034914
D-A  -1.38  -6.963737  4.203737 0.8906642
C-B   2.05  -3.872447  7.972447 0.7530266
D-B  -8.06 -13.643737 -2.476263 0.0041505
D-C -10.11 -16.032447 -4.187553 0.0009497



What I really want would look something like this:

Feed  Mean  TukeyResult
C     73.4  a
B     71.3  a
A     64.6  b
D     63.2  b


Any ideas?
  

Re: [R] spliting first 10 words in a string

2010-11-02 Thread steven mosher

 Thanks david.

  Matevz, maybe I can help explain by doing a very simple and brute force
approach
as opposed to  the way david did it. But you should learn his methods.

I will just do a subset of your problem and if you understand how it works
then you should
be able to get something done and then make it more elegant.

First, I simplify the problem by separating out the "sentence" column.

You can do this with your data frame by simply doing this

MySentence <-data.frame(sentence=yourbigDF$Opis,stringsAsFactors=FALSE)

so I take your original data.frame (yourbigDF) and I just create a copy of
that one column
 $Opis

Later we can merge the two back together after I add 10 columns for the
words


Lets make some dummy data with just 10 rows



 sentence<- "this is a sentence with ten words or maybe more than ten words"
 sentV<-rep(sentence,10)
# now I have just made 10 rows of the same sentence
# NEXT, because I am going to create 10 new columns of 10 rows, I create
# 10 vectors; each is named and each has 10 elements, one for each row.
# They have NO DATA in them

first=second=third=fourth=fifth=sixth=seventh=eighth=ninth=tenth<-vector(length=10)

# Next I create a data frame with Sentence in the first column and 10 blank
# columns.
# NOTE I use stringsAsFactors=FALSE

DF <- data.frame(Sentence=sentence,first,second,third,fourth,fifth,sixth,seventh,eighth,ninth,tenth,stringsAsFactors=FALSE)

# This is what it would look like ( the first row)
DF[1,]

Sentence first second third fourth fifth sixth seventh eighth ninth tenth
1 this is a sentence with ten words or maybe more than ten words FALSE
 FALSE FALSE  FALSE FALSE FALSE   FALSE  FALSE FALSE FALSE

Next, I will show you how to assign the first ten words to the 10 blank
columns

DF[1,2:11]<-strsplit(DF[1,1]," ")[[1]][1:10]

#DF[1,2:11] selects columns 2-11 of the first row
#strsplit splits the sentence into words; [1:10] keeps the first 10, which
#are placed in columns 2-11

If you want to do this the slow way you can just loop through your dataframe
row by row
or you can probably use apply.

Make more sense?
> DF[1,2:11]<-strsplit(DF[1,1]," ")[[1]][1:10]
> DF[1,]
Sentence first
second third   fourth fifth sixth seventh eighth ninth tenth
1 this is a sentence with ten words or maybe more than ten words  this
is a sentence  with   ten   words or maybe  more
> DF[1,"first"]
[1] "this"
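
The row-by-row assignment above can also be done for all rows in one step. The following is a sketch with made-up data; the second, shorter sentence is added only to show what happens when a row has fewer than ten words.

```r
sentence <- "this is a sentence with ten words or maybe more than ten words"
short    <- "only four words here"
DF <- data.frame(Sentence = c(sentence, short), stringsAsFactors = FALSE)

# Split every sentence on single spaces and keep the first 10 pieces of each.
# Indexing with [1:10] pads sentences shorter than 10 words with NA.
first10 <- t(sapply(strsplit(DF$Sentence, " ", fixed = TRUE),
                    function(w) w[1:10]))
colnames(first10) <- c("first", "second", "third", "fourth", "fifth",
                       "sixth", "seventh", "eighth", "ninth", "tenth")

# Attach the ten word columns to the original data frame.
DF <- cbind(DF, first10, stringsAsFactors = FALSE)
DF[1, "tenth"]
```

Because the word columns are bound to the original data frame, the sentence and its first ten words stay in the same row.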

On Tue, Nov 2, 2010 at 12:22 PM, David Winsemius wrote:

>
> On Nov 2, 2010, at 3:01 PM, Matevž Pavlič wrote:
>
>  Hi all,
>>
>> Thanks for all the help. I managed to do it with what Gaj suggested (Excel
>> :().
>>
>> The last solution from David is also great, I just don't understand why R
>>  put the words in 14 columns and three rows?
>>
>
> Because the maximum number of words was 14 and the fill argument was TRUE.
> There were three rows because there were three items in the supplied
> character vector.
>
>
>  I would like it to put just the first 10 words in the source field into 10
>> different destination fields, but in the same row. And so on...is that
>> possible?
>>
>
> I don't know what a destination field might be. Those are not R data types.
>
> This would trim the extra columns (in this example set to those greater
> than 8) by adding a lot of "NULL"'s to the end of a colClasses specification
>  at the expense of a warning message which can be ignored:
>
> > read.table(textConnection(words), fill=T, colClasses = c(rep("character",
> 8), rep("NULL", 30) ) , stringsAsFactors=FALSE )
>
>    V1    V2    V3      V4    V5    V6    V7      V8
> 1   I  have     a columnn  with  text  that     has
> 2   I would  like      to split these words      in
> 3 but  just first     ten words    in   the string.
> Warning message:
> In read.table(textConnection(words), fill = T, colClasses =
> c(rep("character",  :
>  cols = 14 != length(data) = 38
>
>
> If you want to assign the first column to a variable then just:
> > first8 <- read.table(textConnection(words), fill=T, colClasses =
> c(rep("character", 8), rep("NULL", 30) ) , stringsAsFactors=FALSE)
> > var1 <- first8[[1]]
> > var1
> [1] "I"   "I"   "but"
>
> --
> David.
>
>
>
>> Thank you, m
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
>> On Behalf Of David Winsemius
>> Sent: Tuesday, November 02, 2010 3:47 PM
>> To: Gaj Vidmar
>> Cc: r-h...@stat.math.ethz.ch
>> Subject: Re: [R] spliting first 10 words in a string
>>
>>
>> On Nov 2, 2010, at 6:24 AM, Gaj Vidmar wrote:
>>
>>  Though  in this list, in Excel it's just (literally!)
>>> five clicks
>>> away!
>>> (with the column in question selected)
>>> Data -> Text to Columns -> Delimited -> tick Space -> Finish
>>> Pa je! (~Voila in Slovenian)
>>> (then import back to R, keeping only the first 10 columns if so
>>> desired)
>>>
>>
>> You could do the same thing without needing to leave R. Just
>> read.table( textConnection(..), header=FALSE, fill=TRUE)
>>
>>  read.table(textConnection(words), fi

[R] predict() for plm?

2010-11-02 Thread max . e . brown

Hi,

I have a small N large T panel which I am estimating via plm, with fixed
effects.

Is there any way to get predicted values for a new dataset? (I want to
estimate parameters on a subset of my sample, and then use these to
calculate model-implied values for the whole sample).

Alternatively, is there some way of extracting the fixed effects from
the plm fitted model object (then I can calculate the predicted values
myself)?
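
A sketch of the idea in base R, for illustration only: it uses lm() with one dummy per individual (least-squares dummy variables) as a stand-in for a plm within model, and every name and value in the synthetic panel is made up. In plm itself, fixef() is the function that extracts the estimated effects from a fitted within model.

```r
set.seed(1)
# Synthetic small-N, large-T panel (all names and values are made up).
N <- 4; T_len <- 50
panel <- data.frame(id   = factor(rep(1:N, each = T_len)),
                    time = rep(1:T_len, times = N),
                    x    = rnorm(N * T_len))
alpha   <- c(-1, 0, 1, 2)                     # true individual effects
panel$y <- alpha[panel$id] + 0.5 * panel$x + rnorm(N * T_len, sd = 0.1)

# Estimate on a subset (first 30 periods) with one dummy per individual.
# The slope from this dummy-variable fit equals the within (fixed-effects)
# estimate of the slope.
est <- lm(y ~ 0 + id + x, data = panel, subset = time <= 30)

coef(est)[paste0("id", 1:N)]                  # estimated fixed effects
panel$fitted <- predict(est, newdata = panel) # model-implied values, whole sample
```

This only works when every individual appears in the estimation subset, since an effect cannot be predicted for an individual the model has never seen.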

Thanks.

Max


[R] Multiple imputation for nominal data

2010-11-02 Thread John Sorkin

I am looking for an R function that will run multiple imputation (perhaps fully 
conditional imputation, MICE, or sequential generalized regression) for non-MVN 
data, specifically nominal data. My dependent variable is dichotomous, all my 
predictors are nominal. I have a total of 4,500 subjects, 1/2 of whom are 
missing the main independent variables. I would appreciate any suggestions that 
the users of the listserver might have.
John


John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)


Re: [R] spliting first 10 words in a string

2010-11-02 Thread David Winsemius

On Nov 2, 2010, at 3:01 PM, Matevž Pavlič wrote:

Hi all,

Thanks for all the help. I managed to do it with what Gaj suggested  
(Excel :().

The last solution from David is also great, I just don't understand
why R put the words in 14 columns and three rows?

Because the maximum number of words was 14 and the fill argument was  
TRUE. There were three rows because there were three items in the  
supplied character vector.

I would like it to put just the first 10 words in the source field into 10
different destination fields, but in the same row. And so on... is that
possible?

I don't know what a destination field might be. Those are not R data  
types.

This would trim the extra columns (in this example set to those  
greater than 8) by adding a lot of "NULL"'s to the end of a colClasses  
specification  at the expense of a warning message which can be  
ignored:

> read.table(textConnection(words), fill=T, colClasses =  
c(rep("character", 8), rep("NULL", 30) ) , stringsAsFactors=FALSE )

   V1    V2    V3      V4    V5    V6    V7      V8
1   I  have     a columnn  with  text  that     has
2   I would  like      to split these words      in
3 but  just first     ten words    in   the string.
Warning message:
In read.table(textConnection(words), fill = T, colClasses =  
c(rep("character",  :

  cols = 14 != length(data) = 38

If you want to assign the first column to a variable then just:
> first8 <- read.table(textConnection(words), fill=T, colClasses =  
c(rep("character", 8), rep("NULL", 30) ) , stringsAsFactors=FALSE)

> var1 <- first8[[1]]
> var1
[1] "I"   "I"   "but"
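
For the original request (the first ten words of each entry, one word per column, one row per entry), a minimal base-R sketch follows. It uses strsplit() instead of read.table(), which is an alternative to the approach above rather than a correction of it. The helper name first_n_words is hypothetical, and entries with fewer than n words are padded with NA.

```r
words <- c("I have a columnn with text that has quite a few words in it.",
           "I would like to split these words in separate columns",
           "but just first ten words in the string. Is that possible in R?")

# Hypothetical helper: first n space-separated words of each element of x,
# returned as a data frame with one row per element and columns V1..Vn.
first_n_words <- function(x, n = 10) {
  # as.character() so that factor columns are handled as text
  parts <- strsplit(as.character(x), " +")
  # indexing past the end of a vector yields NA, which pads short entries
  out <- do.call(rbind, lapply(parts, function(p) p[seq_len(n)]))
  colnames(out) <- paste0("V", seq_len(n))
  as.data.frame(out, stringsAsFactors = FALSE)
}

first10 <- first_n_words(words)
first10$V1
```

With the real data this could be called on the Opis column of the data frame; the result can then be attached to the original rows with cbind().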

--
David.

Thank you, m
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org 
] On Behalf Of David Winsemius

Sent: Tuesday, November 02, 2010 3:47 PM
To: Gaj Vidmar
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] spliting first 10 words in a string

On Nov 2, 2010, at 6:24 AM, Gaj Vidmar wrote:

Though  in this list, in Excel it's just (literally!)
five clicks
away!
(with the column in question selected)
Data -> Text to Columns -> Delimited -> tick Space -> Finish
Pa je! (~Voila in Slovenian)
(then import back to R, keeping only the first 10 columns if so
desired)

You could do the same thing without needing to leave R. Just
read.table( textConnection(..), header=FALSE, fill=TRUE)

read.table(textConnection(words), fill=T)

   V1    V2    V3      V4    V5    V6    V7      V8       V9     V10
1   I  have     a columnn  with  text  that     has    quite       a
2   I would  like      to split these words      in separate columns
3 but  just first     ten words    in   the string.       Is    that
       V11   V12 V13 V14
1      few words  in it.
2
3 possible    in  R?

Regards,
Assist. Prof. Gaj Vidmar, PhD
University Rehabilitattion Institute, Republic of Slovenia

Irrelevant P.S. Long ago, before embarking on what eventually ended
mainly
in statistics,
I did two years of geology, so (and also because of knowing what the
poster's institute does)
I even kinda imagine what these data are.

"Matevž Pavlič"  wrote in message
news:ad5ca6183570b54f92aa45ce2619f9b9d96...@gi-zrmk.si...

Hi,

I am sorry, will try to be more exact from now on...

I have a data.frame with a field called Opis. It contains sentences that
I would like to split into words or fields in the data.frame... when I say
columns I mean as in an Excel table. I would like to split "Opis" into ten
fields from the first ten words in the Opis field.
Here is an example of my data.frame.

'data.frame':   22928 obs. of  12 variables:
$ VrtinaID: int  1 1 1 1 2 2 2 2 2 2 ...
$ ZapStev : int  1 2 3 4 1 2 3 4 5 6 ...
$ GlobinaOd   : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
$ GlobinaDo   : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
$ Opis: Factor w/ 12754 levels "","(MIVKA) DROBEN  
MELJAST

PESEK, GOST, SIVORJAV",..: 2060 11588 2477 11660 7539 3182 7884
9123 2500
4756 ...
$ ACklasifikacija : Factor w/ 290 levels "","(CL)","(CL)/(SC)",..:
154 125
101 101 NA 106 125 80 106 101 ...
$ GeolNastOd  : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
$ GeolNastDo  : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
$ GeolNastOpis: Factor w/ 113 levels "","B. M. S.",..: 56 53 53
53 56
53 53 53 53 53 ...
$ NacinVrtanjaOd  : num  0e+00 1e+09 1e+09 1e+09 0e+00 ...
$ NacinVrtanjaDo  : num  1.1e+01 1.0e+09 1.0e+09 1.0e+09 1.0e+01 ...
$ NacinVrtanjaOpis: Factor w/ 43 levels "","H. N.","IZKOP",..: 26 1
1 1 26
1 1 1 1 1 ...

Hope that explains better...
Thank you, m

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Monday, November 01, 2010 10:13 PM
To: Matevž Pavlič
Cc: r-help@r-project.org
Subject: Re: [R] spliting first 10 words in a string

On Nov 1, 2010, at 4:39 PM, Matevž Pavlič wrote:

Hi all,

I have a columnn with text that has quite a few words in it. I  
would

like to split these words in separate columns, but just first ten
words in the string. Is that possible

Re: [R] object ".trPaths" not found

2010-11-02 Thread Spackenkasper


I had the problem as well.
It seems that the reason was that Windows doesn't allow "ordinary"
administrators to edit files in the installation drive C: . 
So I - and Tinn-R - couldn't edit the files in the R directory "etc".

You can circumvent it by restarting your system with User Account Control
(Benutzerkontensteuerung) switched off. Edit the file 

/etc/Rconfig.site 

as described above and it should run. 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/object-trPaths-not-found-tp896933p3024219.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread David Winsemius

On Nov 2, 2010, at 2:32 PM, Rainer Hurling wrote:

On 02.11.2010 19:08 (UTC+1), David Winsemius wrote:

On Nov 2, 2010, at 1:19 PM, Rainer Hurling wrote:

Inspired by colouring the dots of box-whisker plots I am trying to
also fill the boxes (rectangles) with different colours. This seems
not to work as I expected.

Looking at the help page of panel.bwplot it says: 'fill - color to
fill the boxplot'. Obviously it is only intended to fill all boxes
with only one colour?

Nevertheless the following example shows, that 'fill' from
panel.bwplot is able to work with more than one colour. But this  
only

works with one colour or multiples of 5 colours:

-
bp1 <- bwplot(voice.part ~ height, data = singer, main="1 color  
works",

panel = function(...) {
panel.bwplot(col=c("yellow"),
fill=c("yellow"), ...)
})

bp2 <- bwplot(voice.part ~ height, data = singer, main = "3 colors  
do

NOT work",
panel = function(...) {
panel.bwplot(col=c("yellow","blue","green"),
fill=c("yellow","blue","green"), ...)
})

bp3 <- bwplot(voice.part ~ height, data = singer, main = "5 colors  
do

work",
panel = function(...) {
panel.bwplot(col=c("yellow","blue","green","pink","red"),
fill=c("yellow","blue","green","pink","red"), ...)
})

plot(bp1, split=c(1,1,1,3))
plot(bp2, split=c(1,2,1,3), newpage=FALSE)
plot(bp3, split=c(1,3,1,3), newpage=FALSE)
-

Is there any chance to use more than one filling colour correctly?

Thanks for answering.

You have eight boxes to fill and 8 dots to color. You can either  
supply
8 distinct colors or you can supply some lesser number and they  
will be

recycled across the entire 8 boxes and dots. What you cannot do ( and
expect to see the dots against the fill background) is plot the  
dots as

the same colors as the fill.

It was not my intention to get the dots coloured in the same colour  
as the boxes. Instead I am looking for a method to fill the boxes  
with a predefined set of different colours (from a color vector). As  
far as I can see this is only possible for one colour and multitudes  
of five colours.

Huh? My example used 4 colors. It should have worked with eight colors  
as well. There are eight groups and

The dots should remain uncoloured ...

Then leave out the col= argument (assuming uncolored means black.)

This will let you see all colors of dots and fill with only 4 colors
because I set it up so that no two identical colors coincide in the
sequence of dots and fill during the recycling:

bp4 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do
work",
panel = function(...) {
panel.bwplot(col=rev(c("yellow","blue","green","pink")),
fill=c("yellow","blue","green","pink"), ...)
})

In your example you can see that the dots colors are painted in the  
right (reversed) order, the boxes are painted as sequence  
c("yellow","pink","green","blue") instead of  
c("yellow","blue","green","pink").

I do not understand how to turn over a given order and with a given  
count of colours to the boxes.

See if this example using selected colors() works to make it clearer:

> colors()[(2:9)*10]
[1] "bisque1" "blue4"   "burlywood3"  "chartreuse3" "coral3"
[6] "cyan2"   "darkgray""darkorange"

bp5 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do  
work",

 panel = function(...) {
   panel.grid(v = -1, h = 0)
   panel.bwplot(fill=colors()[(2:9)*10], ...)
  })

bp5

(Needed to avoid the first colors() because they were mostly variants
of "white".)

> colors()[1:8]
[1] "white" "aliceblue" "antiquewhite"  "antiquewhite1"
[5] "antiquewhite2" "antiquewhite3" "antiquewhite4" "aquamarine"
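
A minimal sketch of the same idea with the colour vector expanded by hand, so that there is exactly one fill colour per box and nothing is left to implicit recycling. The names n_boxes, palette4 and box_fill are made up for illustration. How a shorter vector is matched to boxes may depend on the lattice version (the box order reported earlier in this thread suggests it was not simple recycling there), so this is a workaround under that assumption, not a description of lattice internals.

```r
library(lattice)  # provides bwplot(), panel.bwplot() and the singer data

n_boxes  <- nlevels(singer$voice.part)             # 8 voice parts, so 8 boxes
palette4 <- c("yellow", "blue", "green", "pink")
box_fill <- rep(palette4, length.out = n_boxes)    # explicit recycling to length 8

bp <- bwplot(voice.part ~ height, data = singer,
             main = "one fill colour per box",
             panel = function(...) {
               # no col= here, so the dots keep their default colour
               panel.bwplot(fill = box_fill, ...)
             })
# print(bp) draws the plot on the current graphics device
```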

Thanks in advance,
Rainer Hurling

--
David Winsemius, MD
West Hartford, CT


Re: [R] multicore package: help

2010-11-02 Thread Patrick Connolly

On Mon, 01-Nov-2010 at 06:10PM -0400, Fahim M wrote:

|> I have matrices as below:
|> 
|> a <- matrix(c(1:10, 11, 12), 3,4)
|> aa <- data.frame(a)
|> 
|> b <- matrix(c(10:20, 21), 4,3)
|> bb <- data.frame(b)
|> ...
|> and many more matrices.
|> 
|> st = list(aa,bb, . )

There's probably a tidier way to do it, but without knowing what sort
of thing you want to do, I probably don't have the best way of doing
it, but the following should help you.

You don't need a list in this case.  A simple vector will suffice.
 
st <- c("aa", "bb", etc...)


|> 
|> mclapply(st, FUN, mc.cores=6); #this function apply the function to the
|> elements of the list 'aa', 'bb'...etc
|> 
|> 
|> FUN = function(st)

Use something different from st.  Whatever you call it will be the
individual values.

FUN <- function(x){
ind <- which(st == x) # which is the index you want.
mat.x <- get(x) # which will be the dataframe for that part of your list.

... etc...

}

then assign the output of mclapply to a list

out.list <-  mclapply(st, FUN, mc.cores=6)

You'll probably find it useful to name its elements like this:

names(out.list) <- st
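
Another way to know the index is to iterate over the positions rather than over the objects, so the function receives the index directly. The sketch below assumes the parallel package (which ships with current R and provides mclapply()) in place of multicore; the function name process_one and the summary it returns are placeholders for whatever the real FUN does.

```r
library(parallel)

aa <- data.frame(matrix(1:12, 3, 4))
bb <- data.frame(matrix(10:21, 4, 3))
st <- list(aa = aa, bb = bb)

# Placeholder worker: receives the position i, so the index is always known
process_one <- function(i) {
  mat <- st[[i]]
  list(index = i, name = names(st)[i], total = sum(mat))
}

# Forking is not available on Windows, where mc.cores must be 1
cores <- if (.Platform$OS.type == "windows") 1L else 2L

out.list <- mclapply(seq_along(st), process_one, mc.cores = cores)
names(out.list) <- names(st)
```

The results come back in the order of seq_along(st) regardless of which core finishes first, so out.list[[2]] always belongs to bb.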




HTH




|>  {
|>  Is there a way/function to know the index of st(the list) currently
|> processed by this function as these matrices  are processed in the order of
|> availability of processors?
|> for example, if matrix bb is being processed then the index that I want is
|> 2.
|>  ...
|>  ...
|>  ...
|> 
|>  }
|> 

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~}   Great minds discuss ideas
 _( Y )_ Average minds discuss events 
(:_~*~_:)  Small minds discuss people  
 (_)-(_)  . Eleanor Roosevelt
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.


Re: [R] spliting first 10 words in a string

2010-11-02 Thread Matevž Pavlič

Hi all,  

Thanks for all the help. I managed to do it with what Gaj suggested (Excel :(). 

The last solution from David is also great, I just don't understand why R put
the words in 14 columns and three rows? I would like it to put just the first 10
words in the source field into 10 different destination fields, but in the same row.
And so on... is that possible?

Thank you, m
-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of David Winsemius
Sent: Tuesday, November 02, 2010 3:47 PM
To: Gaj Vidmar
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] spliting first 10 words in a string


On Nov 2, 2010, at 6:24 AM, Gaj Vidmar wrote:

> Though  in this list, in Excel it's just (literally!)  
> five clicks
> away!
> (with the column in question selected)
> Data -> Text to Columns -> Delimited -> tick Space -> Finish
> Pa je! (~Voila in Slovenian)
> (then import back to R, keeping only the first 10 columns if so  
> desired)

You could do the same thing without needing to leave R. Just  
read.table( textConnection(..), header=FALSE, fill=TRUE)

 > read.table(textConnection(words), fill=T)
   V1    V2    V3      V4    V5    V6    V7      V8       V9     V10
1   I  have     a columnn  with  text  that     has    quite       a
2   I would  like      to split these words      in separate columns
3 but  just first     ten words    in   the string.       Is    that
       V11   V12 V13 V14
1      few words  in it.
2
3 possible    in  R?

>
> Regards,
> Assist. Prof. Gaj Vidmar, PhD
> University Rehabilitattion Institute, Republic of Slovenia
>
> Irrelevant P.S. Long ago, before embarking on what eventually ended  
> mainly
> in statistics,
> I did two years of geology, so (and also because of knowing what the
> poster's institute does)
> I even kinda imagine what these data are.
>
> "Matevž Pavlič"  wrote in message
> news:ad5ca6183570b54f92aa45ce2619f9b9d96...@gi-zrmk.si...
>> Hi,
>>
>> I am sorry, will try to be more exact from now on...
>>
>> I have a data.frame with a field called Opis. It contains sentences that
>> I would like to split into words or fields in the data.frame... when I say
>> columns I mean as in an Excel table. I would like to split "Opis" into ten
>> fields from the first ten words in the Opis field.
>> Here is an example of my data.frame.
>>
>> 'data.frame':   22928 obs. of  12 variables:
>> $ VrtinaID: int  1 1 1 1 2 2 2 2 2 2 ...
>> $ ZapStev : int  1 2 3 4 1 2 3 4 5 6 ...
>> $ GlobinaOd   : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
>> $ GlobinaDo   : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
>> $ Opis: Factor w/ 12754 levels "","(MIVKA) DROBEN MELJAST
>> PESEK, GOST, SIVORJAV",..: 2060 11588 2477 11660 7539 3182 7884  
>> 9123 2500
>> 4756 ...
>> $ ACklasifikacija : Factor w/ 290 levels "","(CL)","(CL)/(SC)",..:  
>> 154 125
>> 101 101 NA 106 125 80 106 101 ...
>> $ GeolNastOd  : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
>> $ GeolNastDo  : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
>> $ GeolNastOpis: Factor w/ 113 levels "","B. M. S.",..: 56 53 53  
>> 53 56
>> 53 53 53 53 53 ...
>> $ NacinVrtanjaOd  : num  0e+00 1e+09 1e+09 1e+09 0e+00 ...
>> $ NacinVrtanjaDo  : num  1.1e+01 1.0e+09 1.0e+09 1.0e+09 1.0e+01 ...
>> $ NacinVrtanjaOpis: Factor w/ 43 levels "","H. N.","IZKOP",..: 26 1  
>> 1 1 26
>> 1 1 1 1 1 ...
>>
>> Hope that explains better...
>> Thank you, m
>>
>> -Original Message-
>> From: David Winsemius [mailto:dwinsem...@comcast.net]
>> Sent: Monday, November 01, 2010 10:13 PM
>> To: Matevž Pavlič
>> Cc: r-help@r-project.org
>> Subject: Re: [R] spliting first 10 words in a string
>>
>>
>> On Nov 1, 2010, at 4:39 PM, Matevž Pavlič wrote:
>>
>>> Hi all,
>>>
>>>
>>>
>>> I have a columnn with text that has quite a few words in it. I would
>>> like to split these words in separate columns, but just first ten
>>> words in the string. Is that possible in R?
>>>
>>>
>>
>> Not sure what a column means to you. It's not a precisely defined R
>> type or class. (And you are requested to offer a concrete example
>> rather than making us guess.)
>>
>>> words <-"I have a columnn with text that has quite a few words in
>> it. I would like to split these words in separate columns, but just
>> first ten words in the string. Is that possible in R?"
>>
>>> strsplit(words, " ")[[1]][1:10]
>> [1] "I"   "have""a"   "columnn" "with""text"
>> "that""has" "quite"   "a"
>>
>>
>> Or if in a dataframe:
>>
>>> words <-c("I have a columnn with text that has quite a few words in
>> it.",   "I would like to split these words in separate columns", "but
>> just first ten words in the string. Is that possible in R?")
>>> worddf <- data.frame(words=words)
>>
>>> t(sapply(strsplit(worddf$words, " "), "[", 1:10) )
>> [,1]  [,2][,3][,4]  [,5][,6][,7][,
>> 8]  [,9]   [,10]
>> [1,] "I"   "have"  "a" "columnn" "with"  "text"  "that"  "has"
>

Re: [R] R script on linux?

2010-11-02 Thread Jonathan P Daily

Alternatively, you can simply prefix all scripts with

#! /path/to/R/Rscript
...

where the path is usually /usr/bin/
This info is in the manual that comes packaged with R under, conveniently, 
the scripting section. I assumed that he was getting some error message.
Likely, the script was not created executable, in which case the terminal 
command chmod +x "myscript.R" will do it.
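
A minimal sketch of such a script is shown below. The file name myscript.R and its contents are made up for illustration, and the first line uses /usr/bin/env to locate Rscript, which is a common alternative to hard-coding the path and assumes Rscript is on the PATH.

```r
#!/usr/bin/env Rscript
# myscript.R (hypothetical example): prints the mean of 1..n,
# where n is an optional first command-line argument (default 5).

args <- commandArgs(trailingOnly = TRUE)
n <- if (length(args) >= 1) suppressWarnings(as.integer(args[1])) else 5L
if (is.na(n) || n < 1L) n <- 5L   # fall back on unusable input

result <- mean(seq_len(n))
cat("Mean of 1..", n, " is ", result, "\n", sep = "")
```

After chmod +x myscript.R it can be started from the shell as ./myscript.R 10, or without the executable bit as Rscript myscript.R 10.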
--
Jonathan P. Daily
Technician - USGS Leetown Science Center
11649 Leetown Road
Kearneysville WV, 25430
(304) 724-4480
"Is the room still a room when its empty? Does the room,
 the thing itself have purpose? Or do we, what's the word... imbue it."
 - Jubal Early, Firefly



From:
Thomas Levine 
To:
Jonathan P Daily 
Cc:
gokhanocakoglu , r-help@r-project.org, 
r-help-boun...@r-project.org
Date:
11/02/2010 02:24 PM
Subject:
Re: [R] R script on linux?



Open a terminal, then run these two commands.

cd /home/the/directory/with/your/script
R

Then run this in R

source('yourscript.R')

Tom

2010/11/2 Jonathan P Daily :
> What is the error message?
> --
> Jonathan P. Daily
> Technician - USGS Leetown Science Center
> 11649 Leetown Road
> Kearneysville WV, 25430
> (304) 724-4480
> "Is the room still a room when its empty? Does the room,
>  the thing itself have purpose? Or do we, what's the word... imbue it."
> - Jubal Early, Firefly
>
>
>
> From:
> gokhanocakoglu 
> To:
> r-help@r-project.org
> Date:
> 11/02/2010 09:11 AM
> Subject:
> Re: [R] R  script on linux?
> Sent by:
> r-help-boun...@r-project.org
>
>
>
>
> I can't run the script the program doesn't work...
> --
> View this message in context:
> http://r.789695.n4.nabble.com/R-script-on-linux-tp3023650p3023670.html
> Sent from the R help mailing list archive at Nabble.com.
>




Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread Rainer Hurling


On 02.11.2010 19:08 (UTC+1), David Winsemius wrote:

On Nov 2, 2010, at 1:19 PM, Rainer Hurling wrote:


Inspired by colouring the dots of box-whisker plots I am trying to
also fill the boxes (rectangles) with different colours. This seems
not to work as I expected.

Looking at the help page of panel.bwplot it says: 'fill - color to
fill the boxplot'. Obviously it is only intended to fill all boxes
with only one colour?

Nevertheless the following example shows, that 'fill' from
panel.bwplot is able to work with more than one colour. But this only
works with one colour or multiples of 5 colours:


-
bp1 <- bwplot(voice.part ~ height, data = singer, main="1 color works",
panel = function(...) {
panel.bwplot(col=c("yellow"),
fill=c("yellow"), ...)
})

bp2 <- bwplot(voice.part ~ height, data = singer, main = "3 colors do
NOT work",
panel = function(...) {
panel.bwplot(col=c("yellow","blue","green"),
fill=c("yellow","blue","green"), ...)
})

bp3 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do
work",
panel = function(...) {
panel.bwplot(col=c("yellow","blue","green","pink","red"),
fill=c("yellow","blue","green","pink","red"), ...)
})

plot(bp1, split=c(1,1,1,3))
plot(bp2, split=c(1,2,1,3), newpage=FALSE)
plot(bp3, split=c(1,3,1,3), newpage=FALSE)
-

Is there any chance to use more than one filling colour correctly?




Thanks for answering.


You have eight boxes to fill and 8 dots to color. You can either supply
8 distinct colors or you can supply some lesser number and they will be
recycled across the entire 8 boxes and dots. What you cannot do ( and
expect to see the dots against the fill background) is plot the dots as
the same colors as the fill.


It was not my intention to get the dots coloured in the same colour as 
the boxes. Instead I am looking for a method to fill the boxes with a 
predefined set of different colours (from a color vector). As far as I 
can see this is only possible for one colour and multitudes of five colours.


The dots should remain uncoloured ...


This will let you see all colors of dots and fill with only 4 colors
because I set it up so that no two identical colors coincide in the sequence
of dots and fill during the recycling:

bp4 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do
work",
panel = function(...) {
panel.bwplot(col=rev(c("yellow","blue","green","pink")),
fill=c("yellow","blue","green","pink"), ...)
})


In your example you can see that the dots colors are painted in the 
right (reversed) order, the boxes are painted as sequence 
c("yellow","pink","green","blue") instead of 
c("yellow","blue","green","pink").


I do not understand how to turn over a given order and with a given 
count of colours to the boxes.




Thanks in advance,
Rainer Hurling


David Winsemius, MD
West Hartford, CT



Re: [R] R script on linux?

2010-11-02 Thread Thomas Levine

Open a terminal, then run these two commands.

cd /home/the/directory/with/your/script
R

Then run this in R

source('yourscript.R')

Tom

2010/11/2 Jonathan P Daily :
> What is the error message?
> --
> Jonathan P. Daily
> Technician - USGS Leetown Science Center
> 11649 Leetown Road
> Kearneysville WV, 25430
> (304) 724-4480
> "Is the room still a room when its empty? Does the room,
>  the thing itself have purpose? Or do we, what's the word... imbue it."
>     - Jubal Early, Firefly
>
>
>
> From:
> gokhanocakoglu 
> To:
> r-help@r-project.org
> Date:
> 11/02/2010 09:11 AM
> Subject:
> Re: [R] R  script on linux?
> Sent by:
> r-help-boun...@r-project.org
>
>
>
>
> I can't run the script the program doesn't work...
> --
> View this message in context:
> http://r.789695.n4.nabble.com/R-script-on-linux-tp3023650p3023670.html
> Sent from the R help mailing list archive at Nabble.com.
>


Re: [R] Relsurv package

2010-11-02 Thread David Winsemius



On Nov 2, 2010, at 12:29 PM, Laurence Lauvier wrote:



Hello,
I have a question about relsurv package particularly rsadd function:
rsadd(Surv(time,cens)~sex+ratetable(age=age*365.24,sex=sex,year=year),data=,ratetable=,int=5,method="max.lik").
In the tutorial, it is indicated that "the age and year must be given in the
date format, i.e. in number of days since 01.01.1960". Nevertheless, in
Pohar's article,
http://ibmi.mf.uni-lj.si/ibmi/biostat-center/predtiski/CMPB_Pohar_Stare_relsurv.pdf ,
there is no indication of that. What is the correct way to use this
function?
Thanks for your help,


I seem to remember an almost identical question on rhelp from some  
months ago. (I remember because I looked at the article and the  
package documentation at the time.) Have you contacted the authors at  
any point?


--
David Winsemius, MD
West Hartford, CT


[R] timeSequence

2010-11-02 Thread Carla Leal Kaymalyz

Hello,
I have time series data whose days are not consecutive. I used timeSequence
(from = "4/1/2010", to = "31/12/2010", format = "%Y-%m-%d", FinCenter =
"GMT") to generate a vector of consecutive days. However, I need to leave
out Sundays and also join this vector to a data frame containing
incomplete dates, for example

Days     X1   X2
day 1    10   20
day 2    30   50
day 4    40   45
day 5    45   35
day 6    20   10

To the above I would then add day 3, so that it looks like this:

Days     X1   X2
day 1    10   20
day 2    30   50
day 3    NA   NA
day 4    40   45
day 5    45   35
day 6    20   10
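
The steps described above can be sketched in base R, without timeSequence: build the full run of days, drop Sundays, then left-join the observed data so that missing days show up as NA. The dates and the short range are assumptions for illustration ("4/1/2010" is read here as 4 January 2010), and the object names all_days, work_days, obs and full are made up.

```r
# Full run of calendar days (a short range, for illustration only)
all_days <- seq(as.Date("2010-01-04"), as.Date("2010-01-16"), by = "day")

# Drop Sundays: POSIXlt's wday is 0 for Sunday
work_days <- all_days[as.POSIXlt(all_days)$wday != 0]

# Observed data with a gap (2010-01-06 is missing); values are from the example
obs <- data.frame(
  Days = as.Date(c("2010-01-04", "2010-01-05", "2010-01-07",
                   "2010-01-08", "2010-01-09")),
  X1 = c(10, 30, 40, 45, 20),
  X2 = c(20, 50, 45, 35, 10)
)

# Left join: keep every non-Sunday day, fill unobserved days with NA
full <- merge(data.frame(Days = work_days), obs, by = "Days", all.x = TRUE)
head(full)
```

The same merge() call works with dates produced by timeSequence once they are converted with as.Date().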

Thanks


Re: [R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread David Winsemius



On Nov 2, 2010, at 1:19 PM, Rainer Hurling wrote:

Inspired by colouring the dots of box-whisker plots I am trying to  
also fill the boxes (rectangles) with different colours. This seems  
not to work as I expected.


Looking at the help page of panel.bwplot it says: 'fill - color to  
fill the boxplot'. Obviously it is only intended to fill all boxes  
with only one colour?


Nevertheless the following example shows, that 'fill' from  
panel.bwplot is able to work with more than one colour. But this  
only works with one colour or multiples of 5 colours:



-
bp1 <- bwplot(voice.part ~ height, data = singer, main="1 color  
works",

 panel = function(...) {
   panel.bwplot(col=c("yellow"),
fill=c("yellow"), ...)
 })

bp2 <- bwplot(voice.part ~ height, data = singer, main = "3 colors  
do NOT work",

 panel = function(...) {
   panel.grid(v = -1, h = 0)
   panel.bwplot(col=c("yellow","blue","green"),
fill=c("yellow","blue","green"), ...)
 })

bp3 <- bwplot(voice.part ~ height, data = singer, main = "5 colors  
do work",

 panel = function(...) {
   panel.grid(v = -1, h = 0)

panel.bwplot(col=c("yellow","blue","green","pink","red"),

fill=c("yellow","blue","green","pink","red"), ...)
  })

plot(bp1, split=c(1,1,1,3))
plot(bp2, split=c(1,2,1,3), newpage=FALSE)
plot(bp3, split=c(1,3,1,3), newpage=FALSE)
-


Is there any chance to use more than one filling colour correctly?



You have eight boxes to fill and 8 dots to color. You can either  
supply 8 distinct colors or you can supply some lesser number and they  
will be recycled across the entire 8 boxes and dots. What you cannot  
do ( and expect to see the dots against the fill background) is plot  
the dots as the same colors as the fill.


This will let you see all colors of dots and fill with only 4 colors
because I set it up so that no two identical colors coincide in the
sequence of dots and fill during the recycling:


bp4 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do  
work",

  panel = function(...) {
panel.grid(v = -1, h = 0)
 
panel.bwplot(col=rev(c("yellow","blue","green","pink")),

 fill=c("yellow","blue","green","pink"), ...)
   })
 bp4




Thanks in advance,
Rainer Hurling



David Winsemius, MD
West Hartford, CT


Re: [R] One question on heatmap

2010-11-02 Thread Peter Langfelder

Before plotting a heatmap we usually standardize all genes to mean
zero and variance 1. That way the green/red represent under/over
expression with respect to the mean expression, which is roughly what
the original 2-color arrays (that literally produced such heatmaps)
were measuring. Of course, standardization assumes you have more than
2 samples.

If your expression matrix is stored in the variable expr, with columns
corresponding to samples and rows to genes, you can obtain a
standardized expression matrix as

stdExpr = t(scale(t(expr)))
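
A small worked sketch of that line, using a made-up 3 x 3 matrix (the gene and sample names and all values are hypothetical). scale() standardizes columns, so the matrix is transposed before and after to standardize each gene (row) instead.

```r
# Made-up expression matrix: rows are genes, columns are samples
expr <- matrix(c( 10,  50,  30,
                 900, 100, 500,
                   2,   8,   5),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("geneA", "geneB", "geneC"),
                               c("s1", "s2", "s3")))

# Standardize each gene to mean 0 and variance 1 across samples
stdExpr <- t(scale(t(expr)))

rowMeans(stdExpr)        # numerically zero for every gene
apply(stdExpr, 1, sd)    # one for every gene

# heatmap(stdExpr, scale = "none") would then plot the already-scaled values
```

After this step the very large gene no longer dominates the colour range, because every row is on the same scale.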


Peter


On Tue, Nov 2, 2010 at 8:50 AM, Hua  wrote:
> Dear R-helper:
>
> Suppose we have a matrix:
>
> Gene            sample1 sample2
>
> Gcnt1            12.    52.8
> Max               8.8000    39.1
> Tmem176b         67.9000   304.7
> Shmt2             8.6000    42.4
> Rtn4             11.5000    57.7
> Il17re            7.6000    38.8
> Bclp2             6.2000    32.1
> Mobkl3            4.4000    32.2
> Akr1b10           3.4000    30.1
> Atp6ap2           5.4000    48.2
> Snx2              5.7000    63.1
> Tmem176a          7.6000    91.4
> Klhl9             1.7000    30.3
> Fbxo27            1.    28.9
> Scd1             34.6000     0.7
> Tspan9           35.8000     4.2
> 2210016L21Rik    39.1000     4.9
> Ctnnb1          212.1000    33.1
> Apoe            397.2000    74.2
> H2-DMb1          72.3000    14.1
> Ryk              31.7000     6.4
> Dapk2            85.4000    17.3
> Gzmm            179.4000    36.8
> Actb          12993.4000  2678.1
> Faim3           758.   157.6
> Aktip           209.4000    46.0
> Tbrg1            93.3000    21.3
>
> When I try to make a heatmap based on this gene expression table, I find
> that, when I set 'scale' to 'column', the heatmap is always red. I
> think this is because there are a few very large values in the matrix (gene
> Actb), while most are very small. Thus, the colors look very ugly. I just
> wonder how to set the colors to make the heatmap look better. I have tried
> a log-transformation on the matrix and it's better now. But I do want to know
> if there are better ways to set the color span manually and make the heatmap
> look better without any log-transformation.
>
> Thanks in advance!
>
> Best, Hua


[R] One question on heatmap

2010-11-02 Thread Hua

Dear R-helper:

Suppose we have a matrix:

Gene            sample1 sample2

Gcnt1            12.    52.8
Max               8.8000    39.1
Tmem176b         67.9000   304.7
Shmt2             8.6000    42.4
Rtn4             11.5000    57.7
Il17re            7.6000    38.8
Bclp2             6.2000    32.1
Mobkl3            4.4000    32.2
Akr1b10           3.4000    30.1
Atp6ap2           5.4000    48.2
Snx2              5.7000    63.1
Tmem176a          7.6000    91.4
Klhl9             1.7000    30.3
Fbxo27            1.    28.9
Scd1             34.6000     0.7
Tspan9           35.8000     4.2
2210016L21Rik    39.1000     4.9
Ctnnb1          212.1000    33.1
Apoe            397.2000    74.2
H2-DMb1          72.3000    14.1
Ryk              31.7000     6.4
Dapk2            85.4000    17.3
Gzmm            179.4000    36.8
Actb          12993.4000  2678.1
Faim3           758.   157.6
Aktip           209.4000    46.0
Tbrg1            93.3000    21.3

When I try to make a heatmap based on this gene expression table, I find
that, when I set 'scale' to 'column', the heatmap is always red. I
think this is because there are a few very large values in the matrix (gene Actb),
while most values are very small, so the colors look very ugly. I
wonder how to set the colors to make the heatmap look better. I have tried a
log transformation of the matrix and it looks better now, but I would like to know if
there is a better way to set the color span manually and make the heatmap look
better without any log transformation.

Thanks in advance!

Best, Hua 

[R] Relsurv package

2010-11-02 Thread Laurence Lauvier


Hello, 
I have a question about the relsurv package, particularly the rsadd function:
rsadd(Surv(time,cens)~sex+ratetable(age=age*365.24,sex=sex,year=year),data=,ratetable=,int=5,method="max.lik").
 
In the tutorial, it is indicated that "the age and year must be given in the
date format, i.e. in number of days since 01.01.1960". Nevertheless, in
Pohar’s article,
http://ibmi.mf.uni-lj.si/ibmi/biostat-center/predtiski/CMPB_Pohar_Stare_relsurv.pdf,
there is no indication of that. What is the correct way to use this
function?
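
For reference, the "number of days since 01.01.1960" representation quoted from the tutorial can be computed in base R as sketched below. The example dates and variable names are made up for illustration, and whether rsadd actually requires this form is exactly the open question.

```r
# Days elapsed since 1960-01-01 for a (hypothetical) date of diagnosis
origin_1960 <- as.Date("1960-01-01")
diag_date   <- as.Date("2000-01-01")
year_days   <- as.numeric(diag_date - origin_1960)
year_days

# Age given in years converted to days, as in age*365.24 in the call above
age_years <- 62
age_days  <- age_years * 365.24
age_days
```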
Thanks for your help, 

Laurence 

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Relsurv-package-tp3023956p3023956.html
Sent from the R help mailing list archive at Nabble.com.


[R] cov.mve error

2010-11-02 Thread Marino Taussig De Bodonia, Agnese

Hello,

I am trying to use the cov.mve function on a set of variables to check for 
outliers, before I perform PCA on them. I am using the code that I found in 
"Everitt (2005) An R and S-Plus Companion to Multivariate Analysis" but it 
doesn't seem to work. I wrote:

at.central<-central[,7:17]   # 7:17 are the 10 variables that I want to 
screen for outliers
at.central.mve<-cov.mve(central, cor=T)

I also tried what the help file says to do:

at.central.mve<-cov.rob(central, cor=T, method="mve")

Both give me the error:

Error in quantile.default(as.numeric(x), c(0.25, 0.75), na.rm = na.rm,  :  
missing values and NaN's not allowed if 'na.rm' is FALSE
In addition: Warning message: In quantile(as.numeric(x), c(0.25, 0.75), na.rm = 
na.rm, names = FALSE) : NAs introduced by coercion

My dataset has no NAs, so what does this mean?? Something to do with the 
"quantile.used=" argument?

Thanks in advance for your time,

Agnese
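
A possible reading of the error, offered as a sketch rather than a diagnosis: the warning "NAs introduced by coercion" usually appears when a non-numeric column is coerced to numeric, and the calls above pass the full data frame `central` rather than the subset `at.central`. The data below are synthetic and the column layout is an assumption; only `MASS::cov.rob` and `mahalanobis` are real functions.

```r
library(MASS)

# Synthetic stand-in for the real data: one non-numeric id column plus 4 numeric variables
set.seed(1)
central <- data.frame(id = factor(sprintf("s%02d", 1:40)),
                      matrix(rnorm(40 * 4), ncol = 4))

# Keep only the numeric columns before calling the robust estimator
num_cols   <- sapply(central, is.numeric)
at.central <- central[, num_cols]

# Minimum volume ellipsoid estimate on the numeric subset only
at.central.mve <- cov.rob(at.central, cor = TRUE, method = "mve")

# Robust squared distances; large values point to candidate outliers
dist2 <- mahalanobis(at.central, at.central.mve$center, at.central.mve$cov)
```

Note also that columns 7:17 select 11 variables, not 10, so the indices may be worth checking.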

Re: [R] Please help me about Monte Carlo Permutation

2010-11-02 Thread Łukasz Ręcławowicz

2010/11/2 Chitra 

>
>  yes
>
> >
> >
> > Me too. So you want to do a MC test for Pearson's product-moment
> > correlation, right...?
>
>

So for sample sizes from 3 to about 10 we can use all permutations
(permn() from the combinat package), and the test will be exact. (In our
case 7! = 5040.)

lg <- "lightgreen"
g <- "green"
dg <- "darkgreen"
plot(gamma(1:31), type = "p", main = "Suggested tests for r",
     ylab = "Number of permutations", xlab = "n", lwd = 2,
     col = c(rep(lg, 10), rep(g, 4), rep(dg, 17)),
     log = "yx", pch = "-", cex = 2)
legend(1, range(gamma(4:31))[2], c("exact", "MC", "cor.test"),
       col = c(lg, g, dg), pch = "-", pt.cex = 2)
abline(h = .Machine$integer.max, col = 2, lty = 3)

We use MC when the number of permutations is very large and we cannot use
them all. Besides, for larger samples (> 25) the difference from the
theoretical distribution will be negligible.
Let's use your data:
> data
Qtot Itot
1 73 684
2 64 451
3 71 378
4 65 284
5 47 179
6 31 117
7 19 69

We get 0.01494540 from cor.test

> cor.test(data[,1],data[,2])

Let's write a function for our test, it might be something like:

cor.test.mc <- function(x, y, n = 1e3) {
  our.data <- cbind(x, y)
  if (!is.numeric(our.data[, 1]) || !is.numeric(our.data[, 2]))
    stop("Only numeric variables are allowed.")
  l <- length(our.data[, 1])
  if (l < 3)
    stop("At least 3 samples are required.")
  DNAME <- paste(deparse(substitute(x)), "and", deparse(substitute(y)))
  # draw n random permutations of x and keep only the distinct ones
  samples <- unique(t(replicate(n, sample(our.data[, 1]))))
  loop <- dim(samples)[1]
  correlations <- rep(NA, loop)
  for (i in 1:loop) {
    correlations[i] <- cor(our.data[, 2], samples[i, ])
  }
  observed <- cor(our.data[, 1], our.data[, 2])
  # tail counts as written assume a positive observed correlation
  GE <- sum(correlations >= observed)
  LT <- sum(correlations < (-observed))
  two.tailed.p <- (GE + LT) / loop
  # share of all l! possible permutations that was actually used, in percent
  rea <- (loop / gamma(l + 1)) * 100
  RVAL <- list(statistic = c(r = observed), p.value = two.tailed.p,
               method = "Monte Carlo Pearson's r test", data.name = DNAME,
               samples = c("Number of used unique permutations" = loop),
               total = c("Percent of all possible permutations" = round(rea, 2)))
  class(RVAL) <- "htest"
  # But what kind of plot you wish to have - I don't know...
  # hist(correlations, col = "blue", xlab = "r", xlim = c(-1, 1), breaks = 50)
  return(RVAL)
}
cor.test.mc(data[,1], data[,2])
test <- cor.test.mc(data[,1], data[,2], 6e4)
test
test$samples
test$total

And that's it. Our p-value is sum of 7/5040 (GE) and 61/5040 (LT).
You may also take a look @ library(MChtest).
Hope this helps!
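
For a sample this small the exact test mentioned at the top can also be sketched without any extra package. The helper `all_perms` below is a hypothetical base-R substitute for permn() from combinat, and the tail counts mirror the GE/LT definitions of the function above, which assume a positive observed correlation.

```r
# All permutations of a vector, one per row (n! rows, so only for small n)
all_perms <- function(v) {
  if (length(v) <= 1) return(matrix(v, nrow = 1))
  do.call(rbind, lapply(seq_along(v),
                        function(i) cbind(v[i], all_perms(v[-i]))))
}

# The data from this thread
Qtot <- c(73, 64, 71, 65, 47, 31, 19)
Itot <- c(684, 451, 378, 284, 179, 117, 69)

perms  <- all_perms(Qtot)                 # 7! = 5040 rows
r_perm <- apply(perms, 1, function(p) cor(p, Itot))
r_obs  <- cor(Qtot, Itot)

tol <- 1e-12                              # guard against floating-point ties
GE <- sum(r_perm >= r_obs - tol)          # upper tail, includes the observed ordering
LT <- sum(r_perm < -r_obs)                # lower tail
p_exact <- (GE + LT) / nrow(perms)
p_exact
```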


-- 
Have a nice day


[R] Colour filling in panel.bwplot from lattice

2010-11-02 Thread Rainer Hurling

Inspired by colouring the dots of box-whisker plots I am trying to also 
fill the boxes (rectangles) with different colours. This seems not to 
work as I expected.


Looking at the help page of panel.bwplot it says: 'fill - color to fill 
the boxplot'. Obviously it is only intended to fill all boxes with only 
one colour?


Nevertheless the following example shows, that 'fill' from panel.bwplot 
is able to work with more than one colour. But this only works with one 
colour or multiples of 5 colours:



-
library(lattice)

bp1 <- bwplot(voice.part ~ height, data = singer, main = "1 color works",
  panel = function(...) {
    panel.bwplot(col = c("yellow"),
                 fill = c("yellow"), ...)
  })

bp2 <- bwplot(voice.part ~ height, data = singer,
  main = "3 colors do NOT work",
  panel = function(...) {
    panel.grid(v = -1, h = 0)
    panel.bwplot(col = c("yellow", "blue", "green"),
                 fill = c("yellow", "blue", "green"), ...)
  })

bp3 <- bwplot(voice.part ~ height, data = singer, main = "5 colors do work",
  panel = function(...) {
    panel.grid(v = -1, h = 0)
    panel.bwplot(col = c("yellow", "blue", "green", "pink", "red"),
                 fill = c("yellow", "blue", "green", "pink", "red"), ...)
  })

plot(bp1, split = c(1, 1, 1, 3))
plot(bp2, split = c(1, 2, 1, 3), newpage = FALSE)
plot(bp3, split = c(1, 3, 1, 3), newpage = FALSE)
-


Is there any chance to use more than one filling colour correctly?

Thanks in advance,
Rainer Hurling


Re: [R] Setting the names of a data.frame

2010-11-02 Thread Ivan Calandra


There is also a "header" argument in readHTMLTable().
About the file itself, can't you just erase the introductory text? There 
is also a skip argument to read.table() that might help you.
It's fine if my solution works, but I think it's still safer/easier to 
import the file directly with the correct headers.
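
A small sketch of the skip approach, assuming a comma-separated file with two lines of introductory text before the header (the file contents and column names are invented for illustration):

```r
# Write a hypothetical file: two lines of introductory text, then header, then data
lines <- c("Report generated for illustration",
           "Source: hypothetical example",
           "Name,Date,Qty",
           "Ramu GSV,05/10/2010,4000",
           "Rahul Kumar Singh,09/09/2010,2120")
path <- tempfile(fileext = ".csv")
writeLines(lines, path)

# Skip the two introductory lines, take the next line as header,
# and keep text columns as character instead of factor
tData <- read.csv(path, skip = 2, header = TRUE, stringsAsFactors = FALSE)
str(tData)
```

Reading the headers at import time avoids both the factor problem and the need to drop the header row afterwards.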

Ivan



Le 11/2/2010 17:51, Santosh Srinivas a écrit :

It is just read from a file that has introductory text in the beginning and
a the header starts slight below  so couldn’t use header as such.
I just modified that dataset to ignore the earlier lines ... sHeaders =
tData[4,]&   tData = tData [5:end]

The original data was actually a readHTMLtable from a webpage.

Your solutions works well enough for my purpose ... thanks.


-Original Message-
From: Ivan Calandra [mailto:ivan.calan...@uni-hamburg.de]
Sent: 02 November 2010 22:15
To: Santosh Srinivas
Cc: r-help@r-project.org
Subject: Re: [R] Setting the names of a data.frame

Wait wait,

If sHeaders is actually the first line of tData, the question is how do
you create/read this dataset in R? Isn't read from a text/csv file? In
that case, set the "header" argument to TRUE. If not, there are probably
better ways to do it, better than what you did (i.e. extract the first
line and reuse it).

In any case, that would be easier then (though still not the best way):
names(tData)<- unlist(lapply(tData[1, ], FUN=as.character))

Ivan

Le 11/2/2010 16:09, Santosh Srinivas a écrit :

Thanks.

Actually the sHeaders was a line in tData itself ...
I just did sHeaders = tData [1,]

How can I can build it without factors like your first suggestions?

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]

On

Behalf Of Ivan Calandra
Sent: 02 November 2010 20:22
To: r-help@r-project.org
Subject: Re: [R] Setting the names of a data.frame

Hi,

The problem is that all your columns of sHeaders are factors. It might
be better to set stringsAsFactors to FALSE when you build it.

Or you can do it with a for loop like this:
for (i in 1:length(sHeaders)){
names(tData)[i]<- as.character(sHeaders[1,i])
}

Or with lapply:
names(tData)<- unlist(lapply(sHeaders[1, ], FUN=as.character))

HTH,
Ivan

Le 11/2/2010 14:58, Santosh Srinivas a écrit :

I have tData as below. I need to set the names with the headers from the
first row in sHeaders
Sorry .. forgot how to set the names from row in another data frame ..

pls

advise.

names(tData) = sHeaders[1,] does not work correctly

Also, why doesn't drop.levels(sHeaders) not work?

dput(tData)
structure(list(V1 = structure(c(3L, 1L, 1L, 2L), .Label = c("P H Ravi
Kumar",
"Rahul Kumar Singh", "Ramu GSV"), class = "factor"), V2 = structure(c(1L,
3L, 3L, 2L), .Label = c("05/10/2010", "09/09/2010", "30/09/2010"
), class = "factor"), V3 = structure(c(2L, 1L, 1L, 2L), .Label = c("B",
"S"), class = "factor"), V4 = structure(c(2L, 3L, 3L, 1L), .Label =
c("2120",
"4000", "11000"), class = "factor"), V5 = structure(c(1L, 2L,
2L, 1L), .Label = c("", "0.01"), class = "factor"), V6 = structure(c(2L,
3L, 3L, 1L), .Label = c("765", "1000", "11000"), class = "factor"),
   V7 = structure(c(1L, 2L, 2L, 1L), .Label = c("", "0.01"), class =
"factor")), .Names = c("V1",
"V2", "V3", "V4", "V5", "V6", "V7"), row.names = 5:8, class =

"data.frame")

dput(sHeaders)
structure(list(V1 = structure(1L, .Label = c("Name of Acquirer / Seller",
"Qty", "Ramu GSV"), class = "factor"), V2 = structure(3L, .Label =
c("05/10/2010",
"%", "Transaction Date"), class = "factor"), V3 = structure(1L, .Label =
c("Buy /Sale",
"Qty", "S"), class = "factor"), V4 = structure(3L, .Label = c("4000",
"%", "No.of Shares Transacted"), class = "factor"), V5 = structure(2L,
.Label = c("",
"Holding after Transaction"), class = "factor"), V6 =

structure(NA_integer_,

.Label = "1000", class = "factor"),
   V7 = structure(NA_integer_, .Label = "", class = "factor")), .Names

=

c("V1",
"V2", "V3", "V4", "V5", "V6", "V7"), row.names = 3L, class =

"data.frame")


Thanks very  much.




--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php


Re: [R] density() function: differences with S-PLUS

2010-11-02 Thread Joshua Wiley

Dear Nicola,

There are undoubtedly people here who are familiar with both S+ and R,
but they may not always be around or get to every question.  In that
case there are (at least) two good options for you:

1) Say what you want mathematically (something of a universal
language) or statistically

2) Rather than just give us S+ code, show sample data (e.g., 1:1000),
and the values you would like obtained (in this case whatever the
output from S+ was).  This would let us *try* to figure out what
happened and duplicate it in R.

From the arcane step of reading R's documentation for density (?density):

width: this exists for compatibility with S; if given, and ‘bw’ is
  not, will set ‘bw’ to ‘width’ if this is a character string,
  or to a kernel-dependent multiple of ‘width’ if this is
  numeric.

Which makes me wonder if this works for you (in R)?

density(1:1000, width = 4)
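
To see what the width argument does in R, the two calls can be compared directly. As far as I understand density.default, a numeric width is divided by 4 for the Gaussian kernel, so width = 4 corresponds to bw = 1 rather than bw = 4. Whether this reproduces the S-PLUS output has not been checked here.

```r
x <- 1:1000

# S-style call: bandwidth derived from 'width'
d_width <- density(x, width = 4, n = 50, cut = 0.75)

# Call from the original question: bandwidth given directly
d_bw <- density(x, bw = 4, n = 50, cut = 0.75)

d_width$bw   # bandwidth actually used with width = 4
d_bw$bw      # bandwidth actually used with bw = 4

# Same evaluation grid, different heights
all.equal(d_width$x, d_bw$x)
range(d_width$y - d_bw$y)
```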

Cheers,

Josh

On Tue, Nov 2, 2010 at 3:04 AM, Nicola Sturaro Sommacal (Quantide srl)
 wrote:
> Hello!
>
> Someone know what are the difference between R and S-PLUS in the density()
> function?
>
> For example, I would like to reply this simple S-PLUS code in R, but I don't
> understand which parameter I should modify to get the same results.
>
> S-PLUS CODE:
> density(1:1000, width = 4)
>
> R-CODE:
> density(1:1000, bw = 4, window = "g",  n = 50, cut = 0.75)
>
> I obtain the same x values, but different y values. I try also different
> examples, with different parameter.
>
> Can you help me?
>
> Thank you in advance.
>
> Nicola Sturaro
>

-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


Re: [R] Using R for Production - Discussion

2010-11-02 Thread Saeed Abu Nimeh

I worked on a project where we used a random forest classifier to
predict a binary response. We trained a model in the ec2 cloud with 3
million observations and 44 features. We stored the model that was
generated by R using save(mymodel,file="model.Rdata"). Now we use
model.Rdata locally to predict new observations.
In our local system, we built a parser in Perl to generate the csv
representation of the observation we want to predict, then we used
RSPerl to communicate between Perl and R. But there is a catch,
instead of loading the random forest model (model.Rdata) every time we
want to predict a new observation, we have an R console running as a
daemon with the model.Rdata loaded already. Then, we send the
observation to be predicted from Perl to R. If anyone else has better
solutions/ideas, please feel free to share.
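
The save-once, load-once, predict-many pattern described above could look roughly like this. It is only a sketch: a small logistic regression on the built-in mtcars data stands in for the random forest, and the names `scoring_env` and `score_one` are hypothetical.

```r
# Training side: fit a model and store it on disk
fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)
model_file <- file.path(tempdir(), "model.Rdata")
save(fit, file = model_file)

# Scoring side (long-running R process): load the model a single time
scoring_env <- new.env()
load(model_file, envir = scoring_env)

# Score one observation that arrives as a line of CSV text
score_one <- function(csv_line) {
  obs <- read.csv(text = csv_line, header = FALSE, col.names = c("wt", "hp"))
  unname(predict(scoring_env$fit, newdata = obs, type = "response"))
}

p <- score_one("2.6,110")
p
```

Keeping the loaded model in its own environment means each request only pays for parsing and predict(), not for reading the model file again.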
Thanks,
Saeed

On Mon, Nov 1, 2010 at 9:04 PM, Santosh Srinivas
 wrote:
> Hello Group,
>
> This is an open-ended question.
>
> Quite fascinated by the things I can do and the control I have on my
> activities since I started using R.
> I basically have been using this for analytical related work off my desktop.
> My experience has been quite good and most issues where I need to
> investigate and solve are typical items more related to data errors, format
> corruption, etc... not necessarily "R" Related.
>
> Complementing this with Python gives enough firepower to do lots of
> production (analytical related activities) on the cloud (from my research I
> see that every innovative technology provider seems to support Python ...
> google, amazon, etc).
>
> Question on using R for Production activities:
> Q1) Does anyone have experience of using R-scripts etc ... for production
> related activities. E.g. serving off a computational/ analytical /
> simulation environment from a webportal with the analytical processing done
> in R.
> I've seen that most useful things for normal (not rocket science) business
> (80-20 rule) can be done just as well in R in comparison with tools like
> SAS, Matlab, etc.
>
> Q2) I haven't tried the processing routines for much larger data-sets
> assuming "size" is not a constraint nowadays.
> I know that I should try out ... but any forewarnings would help. Is it
> likely that something that works for my "desktop" dataset is quite as likely
> to work when scaled up to a "cloud dataset"?
> Assuming that I do the clearing out of unused objects, not running into
> infinite loops, etc?
>
> i.e. is there any problem with the "fundamental architecture of R itself"?
> (like press articles often say)
>
>
> Q3) There are big fans of the SAS, Matlab, Mathworks environments out there
>  does anyone have a comparison of how R fares.
> >From my experience R is quite neat and low level ... so overheads should be
> quite low.
> Most slowness comes due to lack of knowledge (see my code ... like using the
> wrong structures, functions, loops, etc.) rather than something wrong with
> the way R itself is.
> Perhaps there is no "commercial" focus to enhance performance related issues
> but my guess is that it is just matter of time till the community evolves
> the language to score higher on that too.
> And perhaps develops documentation to assist the challenge users with
> "performance tips" (the ten commandments types)
>
> Q4) You must have heard about the latest comment from James Goodnight of SAS
> ... "We haven't noticed that a lot. Most of our companies need industrial
> strength software that has been tested, put through every possible scenario
> or failure to make sure everything works correctly."
> My "gut" is that random passionate geeks (playing part-time) do better
> testing than a military of professionals ... (but I've no empirical evidence
> here)
>
> I am not taking a side here (although I appreciate those who do!) .. but
> looking for an objective reasoning.
>
> Thanks,
> S
>


Re: [R] Setting the names of a data.frame

2010-11-02 Thread Santosh Srinivas

It is just read from a file that has introductory text at the beginning, and
the header starts slightly below, so I couldn't use header as such.
I just modified that dataset to ignore the earlier lines ... sHeaders =
tData[4,] and tData = tData[5:end]

The original data was actually a readHTMLTable from a web page.

Your solution works well enough for my purpose ... thanks.


-Original Message-
From: Ivan Calandra [mailto:ivan.calan...@uni-hamburg.de] 
Sent: 02 November 2010 22:15
To: Santosh Srinivas
Cc: r-help@r-project.org
Subject: Re: [R] Setting the names of a data.frame

Wait wait,

If sHeaders is actually the first line of tData, the question is how do 
you create/read this dataset in R? Isn't read from a text/csv file? In 
that case, set the "header" argument to TRUE. If not, there are probably 
better ways to do it, better than what you did (i.e. extract the first 
line and reuse it).

In any case, that would be easier then (though still not the best way):
names(tData) <- unlist(lapply(tData[1, ], FUN=as.character))

Ivan

Le 11/2/2010 16:09, Santosh Srinivas a écrit :
> Thanks.
>
> Actually the sHeaders was a line in tData itself ...
> I just did sHeaders = tData [1,]
>
> How can I can build it without factors like your first suggestions?
>
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On
> Behalf Of Ivan Calandra
> Sent: 02 November 2010 20:22
> To: r-help@r-project.org
> Subject: Re: [R] Setting the names of a data.frame
>
> Hi,
>
> The problem is that all your columns of sHeaders are factors. It might
> be better to set stringsAsFactors to FALSE when you build it.
>
> Or you can do it with a for loop like this:
> for (i in 1:length(sHeaders)){
>names(tData)[i]<- as.character(sHeaders[1,i])
> }
>
> Or with lapply:
> names(tData)<- unlist(lapply(sHeaders[1, ], FUN=as.character))
>
> HTH,
> Ivan
>
> Le 11/2/2010 14:58, Santosh Srinivas a écrit :
>> I have tData as below. I need to set the names with the headers from the
>> first row in sHeaders
>> Sorry .. forgot how to set the names from row in another data frame ..
pls
>> advise.
>>
>> names(tData) = sHeaders[1,] does not work correctly
>>
>> Also, why doesn't drop.levels(sHeaders) not work?
>>
>> dput(tData)
>> structure(list(V1 = structure(c(3L, 1L, 1L, 2L), .Label = c("P H Ravi
>> Kumar",
>> "Rahul Kumar Singh", "Ramu GSV"), class = "factor"), V2 = structure(c(1L,
>> 3L, 3L, 2L), .Label = c("05/10/2010", "09/09/2010", "30/09/2010"
>> ), class = "factor"), V3 = structure(c(2L, 1L, 1L, 2L), .Label = c("B",
>> "S"), class = "factor"), V4 = structure(c(2L, 3L, 3L, 1L), .Label =
>> c("2120",
>> "4000", "11000"), class = "factor"), V5 = structure(c(1L, 2L,
>> 2L, 1L), .Label = c("", "0.01"), class = "factor"), V6 = structure(c(2L,
>> 3L, 3L, 1L), .Label = c("765", "1000", "11000"), class = "factor"),
>>   V7 = structure(c(1L, 2L, 2L, 1L), .Label = c("", "0.01"), class =
>> "factor")), .Names = c("V1",
>> "V2", "V3", "V4", "V5", "V6", "V7"), row.names = 5:8, class =
> "data.frame")
>>
>> dput(sHeaders)
>> structure(list(V1 = structure(1L, .Label = c("Name of Acquirer / Seller",
>> "Qty", "Ramu GSV"), class = "factor"), V2 = structure(3L, .Label =
>> c("05/10/2010",
>> "%", "Transaction Date"), class = "factor"), V3 = structure(1L, .Label =
>> c("Buy /Sale",
>> "Qty", "S"), class = "factor"), V4 = structure(3L, .Label = c("4000",
>> "%", "No.of Shares Transacted"), class = "factor"), V5 = structure(2L,
>> .Label = c("",
>> "Holding after Transaction"), class = "factor"), V6 =
> structure(NA_integer_,
>> .Label = "1000", class = "factor"),
>>   V7 = structure(NA_integer_, .Label = "", class = "factor")), .Names
=
>> c("V1",
>> "V2", "V3", "V4", "V5", "V6", "V7"), row.names = 3L, class =
"data.frame")
>>
>>
>> Thanks very  much.
>>

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php


Re: [R] Setting the names of a data.frame

2010-11-02 Thread Ivan Calandra


Wait wait,

If sHeaders is actually the first line of tData, the question is how do 
you create/read this dataset in R? Isn't it read from a text/csv file? In 
that case, set the "header" argument to TRUE. If not, there are probably 
better ways to do it than what you did (i.e. extract the first 
line and reuse it).


In any case, this would be easier (though still not the best way):
names(tData) <- unlist(lapply(tData[1, ], FUN=as.character))
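
A self-contained sketch of the same idea on a toy data frame whose first row holds the headers and whose columns are factors (the data frame `raw` is invented for illustration):

```r
# Toy data: header text sits in row 1 and every column was read as a factor
raw <- data.frame(V1 = c("Name", "Ramu GSV", "Rahul Kumar Singh"),
                  V2 = c("Qty", "4000", "2120"),
                  stringsAsFactors = TRUE)

# Convert each one-element factor in row 1 to its label, not its integer code
hdr <- vapply(raw[1, ], as.character, character(1), USE.NAMES = FALSE)

# Drop the header row, remove the now unused factor levels, then rename
tData <- droplevels(raw[-1, ])
names(tData) <- hdr
tData
```

The explicit as.character() per column is the step that matters: it returns the label of each factor, whereas coercing the whole row at once can return the internal integer codes.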

Ivan

Le 11/2/2010 16:09, Santosh Srinivas a écrit :

Thanks.

Actually the sHeaders was a line in tData itself ...
I just did sHeaders = tData [1,]

How can I can build it without factors like your first suggestions?

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Ivan Calandra
Sent: 02 November 2010 20:22
To: r-help@r-project.org
Subject: Re: [R] Setting the names of a data.frame

Hi,

The problem is that all your columns of sHeaders are factors. It might
be better to set stringsAsFactors to FALSE when you build it.

Or you can do it with a for loop like this:
for (i in 1:length(sHeaders)){
   names(tData)[i]<- as.character(sHeaders[1,i])
}

Or with lapply:
names(tData)<- unlist(lapply(sHeaders[1, ], FUN=as.character))

HTH,
Ivan

Le 11/2/2010 14:58, Santosh Srinivas a écrit :

I have tData as below. I need to set the names with the headers from the
first row in sHeaders
Sorry .. forgot how to set the names from row in another data frame .. pls
advise.

names(tData) = sHeaders[1,] does not work correctly

Also, why doesn't drop.levels(sHeaders) not work?

dput(tData)
structure(list(V1 = structure(c(3L, 1L, 1L, 2L), .Label = c("P H Ravi
Kumar",
"Rahul Kumar Singh", "Ramu GSV"), class = "factor"), V2 = structure(c(1L,
3L, 3L, 2L), .Label = c("05/10/2010", "09/09/2010", "30/09/2010"
), class = "factor"), V3 = structure(c(2L, 1L, 1L, 2L), .Label = c("B",
"S"), class = "factor"), V4 = structure(c(2L, 3L, 3L, 1L), .Label =
c("2120",
"4000", "11000"), class = "factor"), V5 = structure(c(1L, 2L,
2L, 1L), .Label = c("", "0.01"), class = "factor"), V6 = structure(c(2L,
3L, 3L, 1L), .Label = c("765", "1000", "11000"), class = "factor"),
  V7 = structure(c(1L, 2L, 2L, 1L), .Label = c("", "0.01"), class =
"factor")), .Names = c("V1",
"V2", "V3", "V4", "V5", "V6", "V7"), row.names = 5:8, class =

"data.frame")


dput(sHeaders)
structure(list(V1 = structure(1L, .Label = c("Name of Acquirer / Seller",
"Qty", "Ramu GSV"), class = "factor"), V2 = structure(3L, .Label =
c("05/10/2010",
"%", "Transaction Date"), class = "factor"), V3 = structure(1L, .Label =
c("Buy /Sale",
"Qty", "S"), class = "factor"), V4 = structure(3L, .Label = c("4000",
"%", "No.of Shares Transacted"), class = "factor"), V5 = structure(2L,
.Label = c("",
"Holding after Transaction"), class = "factor"), V6 =

structure(NA_integer_,

.Label = "1000", class = "factor"),
  V7 = structure(NA_integer_, .Label = "", class = "factor")), .Names =
c("V1",
"V2", "V3", "V4", "V5", "V6", "V7"), row.names = 3L, class = "data.frame")


Thanks very  much.




--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php


Re: [R] system() and system2() functions

2010-11-02 Thread Uwe Ligges




On 02.11.2010 15:16, Ralph Olsson wrote:

Hello,

I help to maintain a moderate library of R code. In this code we have a number
of calls to the system function along the lines of:

 exe_output = system("./executable.exe",intern=T)

We tend to prefer system() over shell() because, provided the executable has
been compiled and the working directory set, the command works under both linux
and windows.

We've never had a problem with this code using R 2.9 and lower, but I've
recently started testing code in R 2.12 and have been getting "CreateProcess
failed to run..." error messages.

I've not found much info on this in the change logs/release notes, but from what
I have found I am under the impression that system() no longer "shell quotes"
the command passed to it (if I "shQuote()" the command the code runs fine). I
also see from the help files that a new function "system2()" has been introduced
which takes a different set of arguments and appears to be under development
(from the help page: "system2 is the beginnings of a more portable interface than
system").

Since I assume there to be good reasons for this change to system I'm happy to
spend the time updating our library to work under R 2.12, but before I commence
on this task I wanted to try to get a better understanding of what changes have
been made to system().

My questions are:

1) What is the nature of and motivation for the changes to the system()
function?


Many, one of them is that system() had different behaviour under Linux 
vs. Windows.





2) What does system2() offer that system does not?


Portability.



3) Can anyone recommend the "best" (in particular most future-proof) way of
updating our system calls, preferably, and this may be a big ask, such that they
work in both R 2.9 and R2.12 under both linux and windows?


If it should work for R < 2.12.0, then use system() and add, at least 
for Windows, a shell command (such as "cmd") that allows the executable 
to run under the Windows command shell. Or better, use shell() right 
away, you need to special case for Windows anyway.
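
One way the advice above might be wrapped so that a single call works across R versions and platforms is sketched below. The wrapper name `run_exe` is hypothetical, the Rscript call is only there to have an executable present on every installation, and the behaviour on Windows has not been verified.

```r
# Run an executable and capture its output as a character vector
run_exe <- function(cmd, args = character()) {
  if (exists("system2")) {
    # R >= 2.12.0: command and arguments are passed separately
    system2(cmd, args = args, stdout = TRUE)
  } else if (.Platform$OS.type == "windows") {
    # older R on Windows: go through the command shell
    shell(paste(c(shQuote(cmd), args), collapse = " "), intern = TRUE)
  } else {
    # older R on Unix-alikes
    system(paste(c(shQuote(cmd), args), collapse = " "), intern = TRUE)
  }
}

# Example: call the Rscript binary that ships with the running R
rscript <- file.path(R.home("bin"), "Rscript")
out <- run_exe(rscript, c("-e", shQuote("cat(1 + 1)")))
out
```

For the case in the question, the call would be run_exe("./executable.exe") with the working directory set as before.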


Best,
Uwe Ligges



If any of these questions have previously been answered and I've simply failed
in my googling, links would be appreciated.

Many thanks for your time,

Ralph

Re: [R] Add more functions or dataset in my package

2010-11-02 Thread Uwe Ligges


See the manual "Writing R Extensions".
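
In short, and only as a sketch of the convention described in that manual: a dataset is typically saved as an .rda file in the data/ subdirectory of the package sources. The package directory and the object `mydata` below are hypothetical.

```r
# Hypothetical package source directory (a temporary location for this example)
pkg_dir <- file.path(tempdir(), "mypkg")
dir.create(file.path(pkg_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# The dataset to ship with the package
mydata <- data.frame(id = 1:3, value = c(2.5, 3.1, 4.7))

# One object per file; the file name matches the object name
save(mydata, file = file.path(pkg_dir, "data", "mydata.rda"))
list.files(file.path(pkg_dir, "data"))
```

After rebuilding and installing the package, the dataset should be available through data(mydata); a help page for it under man/ is expected by R CMD check.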

Uwe Ligges


On 02.11.2010 17:13, Carla Moreira wrote:

Hello,

I have constructed an R package, however, now I need to add a  dataset in
the package. How can I do it?
Thank you very much in advance.



Re: [R] subset a data.frame

2010-11-02 Thread Simone Gabbriellini

many thanks, works perfectly!

best,
Simone

Il giorno 02/nov/2010, alle ore 17.17, David Winsemius ha scritto:

> 
> On Nov 2, 2010, at 11:53 AM, Simone Gabbriellini wrote:
> 
>> Hello List,
>> 
>> this should be simple, but cannot figure it out. I am trying to subset a 
>> data.frame like this:
>> 
>>> data4
>>     users                time
>> 1  user5 2009-12-01 14:09:58
>> 2  user1 2009-12-01 14:40:16
>> 3  user8 2009-12-04 08:18:37
>> 4  user6 2009-12-04 08:18:37
>> 5 user83 2009-12-04 08:18:37
>> 6 user82 2009-12-04 08:18:37
>> 7 user31 2009-12-04 08:18:37
>> 8 user85 2009-12-04 08:18:37
>> 9 user33 2009-12-04 08:18:37
>> 10 user2 2010-01-05 07:18:36
>> 
>> I would like to subset it and retain, let's say, only the data with time < 
>> '2010-01-05 07:18:36', but I have no idea about the syntax to do that.
>> 
>> is something like this close to the correct way:
>> 
>> active<-data4['time'<= as.POSIXct("2010-01-05 07:18:36", origin="1970-01-01 
>> 00:00:00-00")]
> 
> Close. Try:
> 
> active <- data4[data4$time <= as.POSIXct("2010-01-05 07:18:36", 
> origin="1970-01-01 00:00:00-00") , ]
> 
> Or:
> 
> active <- subset(data4, time <= as.POSIXct("2010-01-05 07:18:36", 
> origin="1970-01-01 00:00:00-00") )
> 
>> 
> 
> 
> --
> 
> David Winsemius, MD
> West Hartford, CT
> 


Re: [R] Strings from different locale

2010-11-02 Thread Phil Spector


Steven -
   Does typing

Sys.setlocale('LC_ALL','C')

before the offending command suppress the message?
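
If the file's encoding is known or can be guessed, converting the strings is another option. A minimal sketch, assuming purely for illustration that the input is Latin-1:

```r
# Bytes as they might arrive from a Latin-1 file: "caf" followed by byte 0xE9
x <- "caf\xe9"

# Re-encode to UTF-8; iconv() returns NA if the input is not valid in 'from'
y <- iconv(x, from = "latin1", to = "UTF-8")

# Inspect the Unicode code points of the converted string
utf8ToInt(y)
```

When reading a whole file, the fileEncoding argument of read.table() or read.csv() does the same conversion on input.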

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Mon, 1 Nov 2010, steven mosher wrote:


I'm doing some test processing of a csv file that appears to use a different
locale from my machine.

I get the following warning:

input string 1 is invalid in this locale

My locale is US. Is this simply a matter of changing my locale to 'all'
locales?

I don't know what locale the string is in; is there a way to detect this or
translate

[R] Add more functions or dataset in my package

2010-11-02 Thread Carla Moreira

Hello,

I have constructed an R package; however, I now need to add a dataset to
the package. How can I do it?
Thank you very much in advance.
-- 
Carla Moreira


Re: [R] subset a data.frame

2010-11-02 Thread David Winsemius



On Nov 2, 2010, at 11:53 AM, Simone Gabbriellini wrote:


Hello List,

this should be simple, but cannot figure it out. I am trying to  
subset a data.frame like this:



data4

    users                time
1   user5 2009-12-01 14:09:58
2   user1 2009-12-01 14:40:16
3   user8 2009-12-04 08:18:37
4   user6 2009-12-04 08:18:37
5  user83 2009-12-04 08:18:37
6  user82 2009-12-04 08:18:37
7  user31 2009-12-04 08:18:37
8  user85 2009-12-04 08:18:37
9  user33 2009-12-04 08:18:37
10  user2 2010-01-05 07:18:36

I would like to subset it and retain, let's say, only the data with
time < '2010-01-05 07:18:36', but I have no idea about the syntax to
do that.


is something like this close to the correct way:

active<-data4['time'<= as.POSIXct("2010-01-05 07:18:36",  
origin="1970-01-01 00:00:00-00")]


Close. Try:

 active <- data4[data4$time <= as.POSIXct("2010-01-05 07:18:36",  
origin="1970-01-01 00:00:00-00") , ]


Or:

active <- subset(data4, time <= as.POSIXct("2010-01-05 07:18:36",  
origin="1970-01-01 00:00:00-00") )
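
For a self-contained check, here is a small sketch with a four-row stand-in for data4 (the time zone "UTC" is an assumption; the original post does not say which zone was used). The origin argument should not be needed when the input is a character string.

```r
# Made-up subset of the poster's data, with time stored as POSIXct
data4 <- data.frame(
  users = c("user5", "user1", "user8", "user2"),
  time  = as.POSIXct(c("2009-12-01 14:09:58", "2009-12-01 14:40:16",
                       "2009-12-04 08:18:37", "2010-01-05 07:18:36"),
                     tz = "UTC"),
  stringsAsFactors = FALSE
)

cutoff <- as.POSIXct("2010-01-05 07:18:36", tz = "UTC")

# Logical row index: compare the column itself, not the string 'time'
active  <- data4[data4$time < cutoff, ]

# Same selection with subset(), which looks up 'time' inside data4
active2 <- subset(data4, time < cutoff)
print(active)
```

The original attempt compared the literal string 'time' with a date, which is why it did not select rows.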







--

David Winsemius, MD
West Hartford, CT


Re: [R] Using data( ) in a loop

2010-11-02 Thread Dennis Murphy

Hi:

On Tue, Nov 2, 2010 at 8:06 AM, McCarthy, Ian <
ian.mccar...@fticonsulting.com> wrote:

> I'm trying to generate 50+ graphs using the UScensus2000tract data.  I
> need to access the data for just about all of the states, so I was
> hoping to create a simple loop that will take the relevant state from my
> data and load the associated census data from the UScensus2000tract
> package.   Below is a sample of what I'm trying to do.  Any suggestions
> are much appreciated.
>
>
> stores=read.table(paste(path,"\\StoreList.txt",sep=""),header=TRUE,sep="\t")
> city=stores$City
> state=stores$State
> city.state=data.frame(city,state)
>
Or more succinctly (the columns in stores are capitalised),
city.state <- setNames(stores[ , c('City', 'State')], c('city', 'state'))

>
>
> state.temp=city.state$state[1]
> tract <- paste(state.temp,".tract",sep="")
>

I believe you need get() here, but I would think you'd need a path to the
state file you want to grab. See ?get
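
A sketch of one way to load a data set whose name is held in a variable: data() has a list argument that takes character strings, and get() then fetches the object. The data sets mtcars and iris from base R stand in here for names such as those built with paste(state.temp, ".tract"); for the census data one would presumably also pass package = "UScensus2000tract".

```r
# Stand-ins for the per-state data set names
dataset_names <- c("mtcars", "iris")

loaded <- list()
for (nm in dataset_names) {
  data(list = nm, envir = environment())  # load by the name stored in nm
  loaded[[nm]] <- get(nm)                 # fetch the object with that name
}

sapply(loaded, nrow)
```

Calling data(tract) fails because data() takes the unquoted word tract as the data set name rather than looking at the value of the variable.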

HTH,
Dennis

>
> data(tract)
>
> Warning message:
>
> In data(tract) : data set 'tract' not found
>
>
>
>
>
>
>
> Ian McCarthy, Ph.D.
>
> F T I
>
> 214.397.1761 direct
>
> 214.663.1683 mobile
>
> ian.mccar...@fticonsulting.com
> 
>
>
>
>
>

Re: [R] individual intercept and slope

2010-11-02 Thread Phil Spector

You didn't say what form you wanted the output in, but 
here's one way:



sapply(split(dat,dat$individual),function(s)lm(height~time,data=s)$coef)

                   1         2
(Intercept) 8.47     19.87
time        2.485714 -2.057143
- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu


On Tue, 2 Nov 2010, Rosario Garcia Gil wrote:


Hello

I would like to extract the estimates for the intercept and slope by individual 
for growth from a lm fit.
Any advice?

Individual Time point  Height
1   1   10
1   2   11
1   3   23
1   4   15
1   5   21
1   6   23
2   1   24
2   2   12
2   3   9
2   4   10
2   5   11
2   6   10
...

Thanks


[R] subset a data.frame

2010-11-02 Thread Simone Gabbriellini

Hello List,

this should be simple, but cannot figure it out. I am trying to subset a 
data.frame like this:

> data4
    users                time
1   user5 2009-12-01 14:09:58
2   user1 2009-12-01 14:40:16
3   user8 2009-12-04 08:18:37
4   user6 2009-12-04 08:18:37
5  user83 2009-12-04 08:18:37
6  user82 2009-12-04 08:18:37
7  user31 2009-12-04 08:18:37
8  user85 2009-12-04 08:18:37
9  user33 2009-12-04 08:18:37
10  user2 2010-01-05 07:18:36

I would like to subset it and retain, let's say, only the data with time < 
'2010-01-05 07:18:36', but I have no idea about the syntax to do that. 

is something like this close to the correct way:

active<-data4['time'<= as.POSIXct("2010-01-05 07:18:36", origin="1970-01-01 
00:00:00-00")]

thanks in advance for any help.

best regards,
Simone Gabbriellini

Re: [R] Setting the names of a data.frame

2010-11-02 Thread Santosh Srinivas

Thanks. 

Actually, sHeaders was a row of tData itself ...
I just did sHeaders = tData [1,]

How can I build it without factors, as in your first suggestion?

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Ivan Calandra
Sent: 02 November 2010 20:22
To: r-help@r-project.org
Subject: Re: [R] Setting the names of a data.frame

Hi,

The problem is that all your columns of sHeaders are factors. It might 
be better to set stringsAsFactors to FALSE when you build it.

Or you can do it with a for loop like this:
for (i in 1:length(sHeaders)){
  names(tData)[i] <- as.character(sHeaders[1,i])
}

Or with lapply:
names(tData) <- unlist(lapply(sHeaders[1, ], FUN=as.character))

HTH,
Ivan

Le 11/2/2010 14:58, Santosh Srinivas a écrit :
> I have tData as below. I need to set the names with the headers from the
> first row in sHeaders
> Sorry .. forgot how to set the names from row in another data frame .. pls
> advise.
>
> names(tData) = sHeaders[1,] does not work correctly
>
> Also, why doesn't drop.levels(sHeaders) not work?
>
> dput(tData)
> structure(list(V1 = structure(c(3L, 1L, 1L, 2L), .Label = c("P H Ravi
> Kumar",
> "Rahul Kumar Singh", "Ramu GSV"), class = "factor"), V2 = structure(c(1L,
> 3L, 3L, 2L), .Label = c("05/10/2010", "09/09/2010", "30/09/2010"
> ), class = "factor"), V3 = structure(c(2L, 1L, 1L, 2L), .Label = c("B",
> "S"), class = "factor"), V4 = structure(c(2L, 3L, 3L, 1L), .Label =
> c("2120",
> "4000", "11000"), class = "factor"), V5 = structure(c(1L, 2L,
> 2L, 1L), .Label = c("", "0.01"), class = "factor"), V6 = structure(c(2L,
> 3L, 3L, 1L), .Label = c("765", "1000", "11000"), class = "factor"),
>  V7 = structure(c(1L, 2L, 2L, 1L), .Label = c("", "0.01"), class =
> "factor")), .Names = c("V1",
> "V2", "V3", "V4", "V5", "V6", "V7"), row.names = 5:8, class =
"data.frame")
>
>
> dput(sHeaders)
> structure(list(V1 = structure(1L, .Label = c("Name of Acquirer / Seller",
> "Qty", "Ramu GSV"), class = "factor"), V2 = structure(3L, .Label =
> c("05/10/2010",
> "%", "Transaction Date"), class = "factor"), V3 = structure(1L, .Label =
> c("Buy /Sale",
> "Qty", "S"), class = "factor"), V4 = structure(3L, .Label = c("4000",
> "%", "No.of Shares Transacted"), class = "factor"), V5 = structure(2L,
> .Label = c("",
> "Holding after Transaction"), class = "factor"), V6 =
structure(NA_integer_,
> .Label = "1000", class = "factor"),
>  V7 = structure(NA_integer_, .Label = "", class = "factor")), .Names =
> c("V1",
> "V2", "V3", "V4", "V5", "V6", "V7"), row.names = 3L, class = "data.frame")
>
>
> Thanks very  much.
>
>

-- 
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php


[R] Using data( ) in a loop

2010-11-02 Thread McCarthy, Ian

I'm trying to generate 50+ graphs using the UScensus2000tract data.  I
need to access the data for just about all of the states, so I was
hoping to create a simple loop that will take the relevant state from my
data and load the associated census data from the UScensus2000tract
package.   Below is a sample of what I'm trying to do.  Any suggestions
are much appreciated.

 

stores=read.table(paste(path,"\\StoreList.txt",sep=""),header=TRUE,sep="\t")
city=stores$City
state=stores$State
city.state=data.frame(city,state)

state.temp=city.state$state[1]
tract <- paste(state.temp,".tract",sep="")
data(tract)

Warning message:
In data(tract) : data set 'tract' not found

 

 

 

Ian McCarthy, Ph.D.

F T I 

214.397.1761 direct

214.663.1683 mobile

ian.mccar...@fticonsulting.com
 

 

 


Re: [R] Setting the names of a data.frame

2010-11-02 Thread Ivan Calandra


Hi,

The problem is that all your columns of sHeaders are factors. It might 
be better to set stringsAsFactors to FALSE when you build it.


Or you can do it with a for loop like this:
for (i in 1:length(sHeaders)){
 names(tData)[i] <- as.character(sHeaders[1,i])
}

Or with lapply:
names(tData) <- unlist(lapply(sHeaders[1, ], FUN=as.character))
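
A small self-contained sketch of the same idea, using cut-down stand-ins for tData and sHeaders (three columns only, values taken from the dput output in the question). vapply() is used here instead of lapply() plus unlist(); that is a substitution on my part, not part of the suggestion above.

```r
# Cut-down stand-ins; stringsAsFactors = TRUE mimics the factor columns
sHeaders <- data.frame(V1 = "Name of Acquirer / Seller",
                       V2 = "Transaction Date",
                       V3 = "Buy /Sale",
                       stringsAsFactors = TRUE)
tData <- data.frame(V1 = c("Ramu GSV", "P H Ravi Kumar"),
                    V2 = c("05/10/2010", "30/09/2010"),
                    V3 = c("S", "B"),
                    stringsAsFactors = TRUE)

# Convert each one-element factor column to its label, not its integer code
new_names <- vapply(sHeaders[1, ], as.character, character(1))
names(tData) <- unname(new_names)
names(tData)
```

Reading the raw file with stringsAsFactors = FALSE (or as.is = TRUE) in the first place should avoid the conversion step entirely.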

HTH,
Ivan

Le 11/2/2010 14:58, Santosh Srinivas a écrit :

I have tData as below. I need to set its names from the headers in the
first row of sHeaders.
Sorry, I forgot how to set the names from a row in another data frame; please
advise.

names(tData) = sHeaders[1,] does not work correctly

Also, why doesn't drop.levels(sHeaders) work?

dput(tData)
structure(list(V1 = structure(c(3L, 1L, 1L, 2L), .Label = c("P H Ravi
Kumar",
"Rahul Kumar Singh", "Ramu GSV"), class = "factor"), V2 = structure(c(1L,
3L, 3L, 2L), .Label = c("05/10/2010", "09/09/2010", "30/09/2010"
), class = "factor"), V3 = structure(c(2L, 1L, 1L, 2L), .Label = c("B",
"S"), class = "factor"), V4 = structure(c(2L, 3L, 3L, 1L), .Label =
c("2120",
"4000", "11000"), class = "factor"), V5 = structure(c(1L, 2L,
2L, 1L), .Label = c("", "0.01"), class = "factor"), V6 = structure(c(2L,
3L, 3L, 1L), .Label = c("765", "1000", "11000"), class = "factor"),
 V7 = structure(c(1L, 2L, 2L, 1L), .Label = c("", "0.01"), class =
"factor")), .Names = c("V1",
"V2", "V3", "V4", "V5", "V6", "V7"), row.names = 5:8, class = "data.frame")


dput(sHeaders)
structure(list(V1 = structure(1L, .Label = c("Name of Acquirer / Seller",
"Qty", "Ramu GSV"), class = "factor"), V2 = structure(3L, .Label =
c("05/10/2010",
"%", "Transaction Date"), class = "factor"), V3 = structure(1L, .Label =
c("Buy /Sale",
"Qty", "S"), class = "factor"), V4 = structure(3L, .Label = c("4000",
"%", "No.of Shares Transacted"), class = "factor"), V5 = structure(2L,
.Label = c("",
"Holding after Transaction"), class = "factor"), V6 = structure(NA_integer_,
.Label = "1000", class = "factor"),
 V7 = structure(NA_integer_, .Label = "", class = "factor")), .Names =
c("V1",
"V2", "V3", "V4", "V5", "V6", "V7"), row.names = 3L, class = "data.frame")


Thanks very  much.




--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/mitarbeiter.php


Re: [R] spliting first 10 words in a string

2010-11-02 Thread David Winsemius



On Nov 2, 2010, at 6:24 AM, Gaj Vidmar wrote:

Though  in this list, in Excel it's just (literally!) five clicks away!
(with the column in question selected)
Data -> Text to Columns -> Delimited -> tick Space -> Finish
Pa je! (~Voila in Slovenian)
(then import back to R, keeping only the first 10 columns if so desired)


You could do the same thing without needing to leave R. Just
read.table( textConnection(..), header=FALSE, fill=TRUE)


> read.table(textConnection(words), fill=T)
   V1    V2    V3      V4    V5    V6    V7      V8       V9     V10      V11   V12 V13 V14
1   I  have     a columnn  with  text  that     has    quite       a      few words  in it.
2   I would  like      to split these words      in separate columns
3 but  just first     ten words    in   the string.       Is    that possible    in  R?
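
For completeness, a sketch of the strsplit() route on the same three strings, keeping only the first ten words of each; as.character() is there in case the column is stored as a factor.

```r
words <- c("I have a columnn with text that has quite a few words in it.",
           "I would like to split these words in separate columns",
           "but just first ten words in the string. Is that possible in R?")

# Split on single spaces, take elements 1..10 of each piece, and bind the
# pieces into a matrix with one row per input string
first10 <- t(sapply(strsplit(as.character(words), " "), "[", 1:10))
dim(first10)
first10[, 1:3]
```

Strings with fewer than ten words are padded with NA, so the result always has ten columns.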




Regards,
Assist. Prof. Gaj Vidmar, PhD
University Rehabilitation Institute, Republic of Slovenia

Irrelevant P.S. Long ago, before embarking on what eventually ended mainly
in statistics, I did two years of geology, so (and also because of knowing
what the poster's institute does) I even kinda imagine what these data are.

"Matevž Pavlič"  wrote in message
news:ad5ca6183570b54f92aa45ce2619f9b9d96...@gi-zrmk.si...

Hi,

I am sorry, will try to be more exact from now on...

I have a data.frame with a field called Opis. It contains sentences that
I would like to split into words, i.e. into fields in the data.frame. When I
say columns I mean columns as in an Excel table. I would like to split "Opis"
into ten fields holding the first ten words of the Opis field.
Here is an example of my data.frame.

'data.frame':   22928 obs. of  12 variables:
 $ VrtinaID        : int  1 1 1 1 2 2 2 2 2 2 ...
 $ ZapStev         : int  1 2 3 4 1 2 3 4 5 6 ...
 $ GlobinaOd       : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
 $ GlobinaDo       : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
 $ Opis            : Factor w/ 12754 levels "","(MIVKA) DROBEN MELJAST PESEK, GOST, SIVORJAV",..: 2060 11588 2477 11660 7539 3182 7884 9123 2500 4756 ...
 $ ACklasifikacija : Factor w/ 290 levels "","(CL)","(CL)/(SC)",..: 154 125 101 101 NA 106 125 80 106 101 ...
 $ GeolNastOd      : num  0 0.8 9.2 10.1 0 0.9 2.6 4.9 6.8 7.3 ...
 $ GeolNastDo      : num  0.8 9.2 10.1 11 0.9 2.6 4.9 6.8 7.3 8.2 ...
 $ GeolNastOpis    : Factor w/ 113 levels "","B. M. S.",..: 56 53 53 53 56 53 53 53 53 53 ...
 $ NacinVrtanjaOd  : num  0e+00 1e+09 1e+09 1e+09 0e+00 ...
 $ NacinVrtanjaDo  : num  1.1e+01 1.0e+09 1.0e+09 1.0e+09 1.0e+01 ...
 $ NacinVrtanjaOpis: Factor w/ 43 levels "","H. N.","IZKOP",..: 26 1 1 1 26 1 1 1 1 1 ...

Hope that explains better...
Thank you, m

-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Monday, November 01, 2010 10:13 PM
To: Matevž Pavlič
Cc: r-help@r-project.org
Subject: Re: [R] spliting first 10 words in a string


On Nov 1, 2010, at 4:39 PM, Matevž Pavlič wrote:


Hi all,



I have a columnn with text that has quite a few words in it. I would
like to split these words in separate columns, but just first ten
words in the string. Is that possible in R?




Not sure what a column means to you. It's not a precisely defined R
type or class. (And you are requested to offer a concrete example
rather than making us guess.)


words <- "I have a columnn with text that has quite a few words in it. I would like to split these words in separate columns, but just first ten words in the string. Is that possible in R?"

strsplit(words, " ")[[1]][1:10]

 [1] "I"       "have"    "a"       "columnn" "with"    "text"    "that"    "has"     "quite"   "a"


Or if in a dataframe:

words <- c("I have a columnn with text that has quite a few words in it.",
           "I would like to split these words in separate columns",
           "but just first ten words in the string. Is that possible in R?")

worddf <- data.frame(words=words)

t(sapply(strsplit(as.character(worddf$words), " "), "[", 1:10) )

     [,1]  [,2]    [,3]    [,4]      [,5]    [,6]    [,7]    [,8]      [,9]       [,10]
[1,] "I"   "have"  "a"     "columnn" "with"  "text"  "that"  "has"     "quite"    "a"
[2,] "I"   "would" "like"  "to"      "split" "these" "words" "in"      "separate" "columns"
[3,] "but" "just"  "first" "ten"     "words" "in"    "the"   "string." "Is"       "that"


--
David Winsemius, MD
West Hartford, CT



David Winsemius, MD
West Hartford, CT


Re: [R] Using R for Production - Discussion

2010-11-02 Thread Douglas Bates

On Mon, Nov 1, 2010 at 11:04 PM, Santosh Srinivas
 wrote:
> Hello Group,
>
> This is an open-ended question.
>
> Quite fascinated by the things I can do and the control I have on my
> activities since I started using R.
> I basically have been using this for analytical related work off my desktop.
> My experience has been quite good and most issues where I need to
> investigate and solve are typical items more related to data errors, format
> corruption, etc... not necessarily "R" Related.
>
> Complementing this with Python gives enough firepower to do lots of
> production (analytical related activities) on the cloud (from my research I
> see that every innovative technology provider seems to support Python ...
> google, amazon, etc).
>
> Question on using R for Production activities:
> Q1) Does anyone have experience of using R-scripts etc ... for production
> related activities. E.g. serving off a computational/ analytical /
> simulation environment from a webportal with the analytical processing done
> in R.
> I've seen that most useful things for normal (not rocket science) business
> (80-20 rule) can be done just as well in R in comparison with tools like
> SAS, Matlab, etc.
>
> Q2) I haven't tried the processing routines for much larger data-sets
> assuming "size" is not a constraint nowadays.
> I know that I should try out ... but any forewarnings would help. Is it
> likely that something that works for my "desktop" dataset is quite as likely
> to work when scaled up to a "cloud dataset"?
> Assuming that I do the clearing out of unused objects, not running into
> infinite loops, etc?
>
> i.e. is there any problem with the "fundamental architecture of R itself"?
> (like press articles often say)
>
>
> Q3) There are big fans of the SAS, Matlab, Mathworks environments out there
>  does anyone have a comparison of how R fares.
> From my experience R is quite neat and low level ... so overheads should be
> quite low.
> Most slowness comes due to lack of knowledge (see my code ... like using the
> wrong structures, functions, loops, etc.) rather than something wrong with
> the way R itself is.
> Perhaps there is no "commercial" focus to enhance performance related issues
> but my guess is that it is just matter of time till the community evolves
> the language to score higher on that too.
> And perhaps develops documentation to assist the challenge users with
> "performance tips" (the ten commandments types)
>
> Q4) You must have heard about the latest comment from James Goodnight of SAS
> ... "We haven't noticed that a lot. Most of our companies need industrial
> strength software that has been tested, put through every possible scenario
> or failure to make sure everything works correctly."
> My "gut" is that random passionate geeks (playing part-time) do better
> testing than a military of professionals ... (but I've no empirical evidence
> here)
>
> I am not taking a side here (although I appreciate those who do!) .. but
> looking for an objective reasoning.

Regarding performance and size of data sets I would suggest viewing
the presentation that Dirk Eddelbuettel and Romain Francois gave at
Google recently.  David Smith links to it in his blog at
blog.revolutionanalytics.com

One of the advantages of Open Source systems is that people can
provide many different kinds of hooks into the code.

At present R vectors are indexed with 32-bit signed integers, which
limits the length of an individual vector to 2^{31}-1 elements.
There are some methods available for using external storage to bypass
this, but they do introduce another level of complexity.
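
To put a number on that limit, a quick sketch (the memory figure assumes 8 bytes per element, i.e. a numeric vector):

```r
# Largest 32-bit signed integer, which is also the largest vector index
max_len <- .Machine$integer.max
max_len == 2^31 - 1

# Approximate memory, in GiB, for a numeric vector of that length
gib <- max_len * 8 / 2^30
gib
```

So a single numeric vector at the limit would need about 16 GiB. As far as I know, later releases (R 3.0.0 onwards) added long vectors on 64-bit builds, which relaxes this limit for atomic vectors.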


Re: [R] Question about ggplot2

2010-11-02 Thread Joshua Wiley

Dear Shige,

You can use scale_y_continuous() to achieve this.

year.plot <- ggplot(d, aes(year, rate))
year.plot + stat_summary(fun.y = "mean", geom = "line") +
  scale_y_continuous(limits = c(0, .1))

where limits may be whatever you like for the y axis.
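
One caveat worth checking: as far as I understand ggplot2, limits set through scale_y_continuous() or ylim() drop observations outside the range before the summary is computed, which matters if rate is recorded as 0/1 per infant. A sketch of the zoom-only alternative, coord_cartesian(), with made-up data; the argument is spelled fun in ggplot2 3.3.0 and later, fun.y in older versions.

```r
library(ggplot2)

# Made-up individual-level data: rate is 0/1 with 5 deaths per 100 each year
d <- data.frame(
  year = rep(2000:2002, each = 100),
  rate = rep(c(rep(1, 5), rep(0, 95)), times = 3)
)

# coord_cartesian() zooms the y axis without discarding observations, so the
# yearly mean is still computed from every row (0.05 for each year here)
p <- ggplot(d, aes(year, rate)) +
  stat_summary(fun = mean, geom = "line") +
  coord_cartesian(ylim = c(0, 0.1))
print(p)
```

If rate is already an aggregated proportion below 0.1, both approaches should give the same picture.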

Cheers,

Josh

On Tue, Nov 2, 2010 at 6:57 AM, Shige Song  wrote:
> Dear All,
>
> I am trying to graph a simple scatter plot where the x axis is year
> and the y axis is a percentage (percentage of infant death). Instead
> of plotting the raw data, I want to plot summary statistics such as
> mean and median. Here is the problem: the value range of y is between
> 0 and 1, but since infant death is a rare event, the mean and median
> is very low (something like 5%), which shows up as a horizontal line
> at the bottom of the figure. My question is: how do I change the scale
> of the y-axis so that it does not have the range between 0 and 1 but
> between 0 and 0.1? Many thanks.
>
> By the way, I am using ggplot2, and here is my code:
>
> ---
> year.plot <- ggplot(d, aes(year, rate))
> year.plot + stat_summary(fun.y = "mean", geom = "line")
> ---
>
> Best,
> Shige
>
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


Re: [R] Question about ggplot2

2010-11-02 Thread Abhijit Dasgupta

from where you are, 

year.plot+ylim(0,0.1)

Abhijit

On Nov 2, 2010, at 9:57 AM, Shige Song wrote:

> Dear All,
> 
> I am trying to graph a simple scatter plot where the x axis is year
> and the y axis is a percentage (percentage of infant death). Instead
> of plotting the raw data, I want to plot summary statistics such as
> mean and median. Here is the problem: the value range of y is between
> 0 and 1, but since infant death is a rare event, the mean and median
> is very low (something like 5%), which shows up as a horizontal line
> at the bottom of the figure. My question is: how do I change the scale
> of the y-axis so that it does not have the range between 0 and 1 but
> between 0 and 0.1? Many thanks.
> 
> By the way, I am using ggplot2, and here is my code:
> 
> ---
> year.plot <- ggplot(d, aes(year, rate))
> year.plot + stat_summary(fun.y = "mean", geom = "line")
> ---
> 
> Best,
> Shige
> 

Re: [R] individual intercept and slope

2010-11-02 Thread Dimitris Rizopoulos


Have a look at function(s) lmList() from packages lme4 or nlme.
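
A minimal sketch with the nlme version, using the twelve rows from the question (column names individual, time and height are my own choice):

```r
library(nlme)

# The two individuals from the original post; the grouping variable is a factor
dat <- data.frame(
  individual = factor(rep(1:2, each = 6)),
  time       = rep(1:6, times = 2),
  height     = c(10, 11, 23, 15, 21, 23, 24, 12, 9, 10, 11, 10)
)

# Fit height ~ time separately within each individual
fit <- lmList(height ~ time | individual, data = dat)

# One row per individual, one column per coefficient
coef(fit)
```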

I hope it helps.

Best,
Dimitris


On 11/2/2010 3:14 PM, Rosario Garcia Gil wrote:

Hello

I would like to extract the estimates for the intercept and slope by individual 
for growth from a lm fit.
Any advice?

Individual Time point  Height
1   1   10
1   2   11
1   3   23
1   4   15
1   5   21
1   6   23
2   1   24
2   2   12
2   3   9
2   4   10
2   5   11
2   6   10
...

Thanks




--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/


Re: [R] Error message in fit.mult.impute (Hmisc package)

2010-11-02 Thread Kim Fernandes

Thank you! That has fixed the problem.

Kim

On Tue, Nov 2, 2010 at 7:42 AM, Frank Harrell wrote:

>
> I tried your code with the rms package (replacement for the Design package;
> see http://biostat.mc.vanderbilt.edu/Rrms) and it worked fine.
>
> Note that multiple imputation needs the outcome variable in the imputation
> model.
>
> Frank
>
>
> -
> Frank Harrell
> Department of Biostatistics, Vanderbilt University
> --
> View this message in context:
> http://r.789695.n4.nabble.com/Error-message-in-fit-mult-impute-Hmisc-package-tp3022817p3023563.html
> Sent from the R help mailing list archive at Nabble.com.
>

[R] system() and system2() functions

2010-11-02 Thread Ralph Olsson

Hello,

I help to maintain a moderate library of R code. In this code we have a number 
of calls to the system function along the lines of:

exe_output = system("./executable.exe",intern=T)

We tend to prefer system() over shell() because, provided the executable has 
been compiled and the working directory set, the command works under both linux 
and windows.

We've never had a problem with this code using R 2.9 and lower, but I've 
recently started testing code in R 2.12 and have been getting "CreateProcess 
failed to run..." error messages.

I've not found much info on this in the change logs/release notes, but from 
what I have found I am under the impression that system() no longer "shell 
quotes" the command passed to it (if I "shQuote()" the command the code runs 
fine). I also see from the help files that a new function "system2()" has been 
introduced which takes a different set of arguments and appears to be under 
development (from the help page: "system2 is the beginnings of a more portable 
interface than system").
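
For reference, a small sketch of what a system2() call looks like (it needs R 2.12.0 or later). The command and its arguments are passed separately; Rscript is used here only as an executable that should exist on every platform where R runs, standing in for ./executable.exe.

```r
# Path to the Rscript binary that ships with the running R installation
rscript <- file.path(R.home("bin"), "Rscript")

# Arguments go in a character vector; shQuote() protects the expression
# from the shell. stdout = TRUE captures the output as a character vector.
out <- system2(rscript, args = c("-e", shQuote("cat(1 + 1)")), stdout = TRUE)
print(out)
```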

Since I assume there to be good reasons for this change to system I'm happy to 
spend the time updating our library to work under R 2.12, but before I commence 
on this task I wanted to try to get a better understanding of what changes have 
been made to system().

My questions are:

1) What is the nature of and motivation for the changes to the system() 
function?

2) What does system2() offer that system does not?

3) Can anyone recommend the "best" (in particular most future-proof) way of 
updating our system calls, preferably, and this may be a big ask, such that 
they work in both R 2.9 and R 2.12 under both Linux and Windows?

If any of these questions have previously been answered and I've simply failed 
in my googling, links would be appreciated.

Many thanks for your time,

Ralph