Re: [R] Data Manipulation

2010-01-26 Thread Peter Rote


I am still struggling with this.

Error message:

>  by(AlexETF,AlexETF$Industry,function(a) {filename = paste("C:/ab/",gsub(" ","",a$Industry[1]),".txt",sep="")
+ print(filename)
+ write.table(a[,3,drop=FALSE],quote=FALSE,col.names=FALSE,row.names=FALSE)
+ }
+  )  

[1] "C:/ab/Accident&HealthInsurance.txt"
Error in `[.data.frame`(a, , 3, drop = FALSE) :
  undefined columns selected

Best,
Peter 
-- 
View this message in context: 
http://n4.nabble.com/Data-Manipulation-tp1018249p1290191.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
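
A minimal sketch of one way around the error above, not a confirmed fix from
the thread. It assumes the data frame has only two columns, named Industry
and Ticker (as the head() output elsewhere in the thread suggests), so column
3 does not exist. The toy data, the "TICK1" ticker, out_dir, and the helper
name write_tickers are made up for illustration. The sketch also passes file=
to write.table, which the call above omits.

```r
# Toy stand-in for AlexETF (assumed column names: Industry, Ticker)
AlexETF <- data.frame(
  Industry = c("Advertising Agencies", "Advertising Agencies",
               "Advertising Agencies", "Advertising Agencies",
               "Accident & Health Insurance"),
  Ticker   = c("CMM", "FMCN", "IPG", "MWW", "TICK1"),  # TICK1 is a placeholder
  stringsAsFactors = TRUE
)

# Hypothetical output directory standing in for C:/ab/
out_dir <- file.path(tempdir(), "ab")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

write_tickers <- function(a) {
  filename <- file.path(out_dir,
                        paste(gsub(" ", "", a$Industry[1]), ".txt", sep = ""))
  # Select the column by name rather than by position, and write to the file
  write.table(a[, "Ticker", drop = FALSE], file = filename,
              quote = FALSE, col.names = FALSE, row.names = FALSE)
  filename
}

written <- by(AlexETF, AlexETF$Industry, write_tickers)
```

Selecting by name fails loudly if the column is renamed, whereas a positional
index silently depends on the column order.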

Re: [R] Data transformation

2010-01-25 Thread Gabor Grothendieck

Try this:

> t(apply(x, 1, function(r) table(factor(r, levels = seq_len(max(x))))))
     1 2 3 4 5 6 7 8 9 10
[1,] 1 0 1 0 0 0 0 0 0  0
[2,] 0 2 0 0 0 0 0 0 0  0
[3,] 0 0 0 1 0 0 1 0 0  0
[4,] 0 0 0 0 0 1 0 1 0  0
[5,] 0 0 0 0 1 0 0 0 0  1

If you use aaply in the plyr package instead of apply then you can
omit the transpose.


On Mon, Jan 25, 2010 at 5:39 PM, Lisa  wrote:
>
> Dear all,
>
> I  have a dataset that looks like this:
>
> x <- read.table(textConnection("col1 col2
> 3 1
> 2 2
> 4 7
> 8 6
> 5 10"), header=TRUE)
>
> I want to rewrite it as below:
>
> var1 var2 var3 var4 var5 var6 var7 var8 var9 var10
>    1     0     1      0     0     0     0     0      0      0
>    0     2     0      0     0     0     0     0      0      0
>    0     0     0      1     0     0     1     0      0      0
>    0     0     0      0     0     1     0     1      0      0
>    0     0     0      0     1     0     0     0      0      1
>
> Can anybody please help how to get this done? Your help would be greatly
> appreciated.
>
> Lisa

Re: [R] Data transformation

2010-01-25 Thread Seeliger . Curt

r-help-boun...@r-project.org wrote on 01/25/2010 02:39:32 PM:
> x <- read.table(textConnection("col1 col2 
> 3 1 
> 2 2 
> 4 7 
> 8 6 
> 5 10"), header=TRUE) 
> 
> I want to rewrite it as below:
> 
> var1 var2 var3 var4 var5 var6 var7 var8 var9 var10
> 1 0 1  0 0 0 0 0  0  0
> 0 2 0  0 0 0 0 0  0  0
> 0 0 0  1 0 0 1 0  0  0
> 0 0 0  0 0 1 0 1  0  0
> 0 0 0  0 1 0 0 0  0  1
> 
> Can anybody please help how to get this done? Your help would be greatly
> appreciated. 

Thanks, I've not seen textConnection() before.  The table() function will 
get you close:

table(c(rownames(x),rownames(x)), c(x$col1,x$col2))

cur
-- 
Curt Seeliger, Data Ranger
Raytheon Information Services - Contractor to ORD
seeliger.c...@epa.gov
541/754-4638
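
A small sketch of why the table() call above only gets "close", and one
possible way to close the gap. The value 9 never occurs in the data, so the
plain cross-tabulation has nine columns; turning the values into a factor
with levels 1..max(x) keeps the empty column. The names close_tab and
full_tab are illustrative only.

```r
x <- read.table(text = "col1 col2
3 1
2 2
4 7
8 6
5 10", header = TRUE)

# The call suggested above: one column per value that actually occurs
close_tab <- table(c(rownames(x), rownames(x)), c(x$col1, x$col2))

# Fixing the factor levels keeps a column for every value 1..max(x)
full_tab <- table(row   = rep(seq_len(nrow(x)), times = ncol(x)),
                  value = factor(unlist(x), levels = seq_len(max(x))))
```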




Re: [R] Data transformation

2010-01-25 Thread Lisa


Thank you so much.

Lisa
-- 
View this message in context: 
http://n4.nabble.com/Data-transformation-tp1289899p1289915.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Data transformation

2010-01-25 Thread Steve Lianoglou

Hi,

On Mon, Jan 25, 2010 at 5:39 PM, Lisa  wrote:
>
> Dear all,
>
> I  have a dataset that looks like this:
>
> x <- read.table(textConnection("col1 col2
> 3 1
> 2 2
> 4 7
> 8 6
> 5 10"), header=TRUE)
>
> I want to rewrite it as below:
>
> var1 var2 var3 var4 var5 var6 var7 var8 var9 var10
>    1     0     1      0     0     0     0     0      0      0
>    0     2     0      0     0     0     0     0      0      0
>    0     0     0      1     0     0     1     0      0      0
>    0     0     0      0     0     1     0     1      0      0
>    0     0     0      0     1     0     0     0      0      1
>
> Can anybody please help how to get this done? Your help would be greatly
> appreciated.

I was trying to do it w/o for loops, but I can't figure out a way to do so:

R> bounds <- range(x)
R> m <- matrix(0, nrow=nrow(x), ncol=bounds[2])
R> colnames(m) <- paste('var', seq(bounds[2]), sep="")
## Ugly nested for-loop one-liner below
R> for (i in 1:nrow(x))for (j in 1:ncol(x)) m[i,x[i,j]] <- m[i,x[i,j]] + 1
R> m

     var1 var2 var3 var4 var5 var6 var7 var8 var9 var10
[1,]    1    0    1    0    0    0    0    0    0     0
[2,]    0    2    0    0    0    0    0    0    0     0
[3,]    0    0    0    1    0    0    1    0    0     0
[4,]    0    0    0    0    0    1    0    1    0     0
[5,]    0    0    0    0    1    0    0    0    0     1

-steve

-- 
Steve Lianoglou
Graduate Student: Computational Systems Biology
 | Memorial Sloan-Kettering Cancer Center
 | Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
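
For comparison, a loop-free sketch of the same counting step. It is not from
the thread: it swaps the nested for loops for base R's tabulate(), which
counts how often each integer 1..nbins occurs, applied to each row.

```r
x <- read.table(text = "col1 col2
3 1
2 2
4 7
8 6
5 10", header = TRUE)

# apply() returns one column per input row, hence the transpose
m <- t(apply(x, 1, tabulate, nbins = max(x)))
colnames(m) <- paste("var", seq_len(ncol(m)), sep = "")
```

A plain matrix-index assignment such as m[idx] <- m[idx] + 1 would not be a
safe replacement here, because a repeated index (row 2 holds the value 2
twice) is incremented only once.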


Re: [R] Data transformation

2010-01-25 Thread Sarah Goslee

Well, I have no idea how to get from one to the other. There's
col1 and col2 but no var1 var2 var3, etc. I thought perhaps col1
was the row index and col2 was the column index, but that doesn't
match up either, and not all the cell values are 1.

So you will need to explain more clearly what you intend.

Meanwhile, you might try reshape, or perhaps crosstab from the
ecodist package.

Sarah

On Mon, Jan 25, 2010 at 5:39 PM, Lisa  wrote:
>
> Dear all,
>
> I  have a dataset that looks like this:
>
> x <- read.table(textConnection("col1 col2
> 3 1
> 2 2
> 4 7
> 8 6
> 5 10"), header=TRUE)
>
> I want to rewrite it as below:
>
> var1 var2 var3 var4 var5 var6 var7 var8 var9 var10
>    1     0     1      0     0     0     0     0      0      0
>    0     2     0      0     0     0     0     0      0      0
>    0     0     0      1     0     0     1     0      0      0
>    0     0     0      0     0     1     0     1      0      0
>    0     0     0      0     1     0     0     0      0      1
>
> Can anybody please help how to get this done? Your help would be greatly
> appreciated.
>
> Lisa
>

-- 
Sarah Goslee
http://www.functionaldiversity.org


[R] Data transformation

2010-01-25 Thread Lisa


Dear all,

I  have a dataset that looks like this:

x <- read.table(textConnection("col1 col2 
3 1 
2 2 
4 7 
8 6 
5 10"), header=TRUE) 

I want to rewrite it as below:

var1 var2 var3 var4 var5 var6 var7 var8 var9 var10
1 0 1  0 0 0 0 0  0  0
0 2 0  0 0 0 0 0  0  0
0 0 0  1 0 0 1 0  0  0
0 0 0  0 0 1 0 1  0  0
0 0 0  0 1 0 0 0  0  1

Can anybody please help how to get this done? Your help would be greatly
appreciated. 

Lisa 

-- 
View this message in context: 
http://n4.nabble.com/Data-transformation-tp1289899p1289899.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Data import export zipped files from URLs

2010-01-24 Thread Henrique Dallazuanna

The same error for me:


"Not Found
The requested object does not exist on this server. The link you
followed is either outdated, inaccurate, or the server has been
instructed not to let you have it. Please inform the site
administrator of the referring page."

On Sun, Jan 24, 2010 at 3:14 PM, Peter Ehlers  wrote:
> That's not the case for me:
>
> Not Found
> The requested object does not exist on this server. The link you followed is
> either outdated, inaccurate, or the server has been instructed not to let
> you have it.
> Firefox 3.6
>
>  -Peter Ehlers
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: [R] Data import export zipped files from URLs

2010-01-24 Thread Peter Ehlers


That's not the case for me:

Not Found
The requested object does not exist on this server. The link you 
followed is either outdated, inaccurate, or the server has been 
instructed not to let you have it.

Firefox 3.6

  -Peter Ehlers


Re: [R] Data import export zipped files from URLs

2010-01-23 Thread Velappan Periasamy

http://nseindia.com/content/equities/scripvol/datafiles/01-01-2010-TO-23-01-2010RCOMXN.csv


The URL is correct, and it is not a zipped file.
Copy the URL into the browser address bar and you will get this:


Symbol,Series,Date, Prev Close,Open Price,High Price,Low Price,Last Price,Close Price,Average Price,Total Traded Quantity,Turnover in Lacs,
RCOM,EQ,04-Jan-2010,172.35,173,175.8,172.55,175.25,175.2,174.17,2418999,4213.1160435,
RCOM,EQ,05-Jan-2010,175.2,176,182,175.8,181.45,181.35,178.64,6033757,10778.7459905,
RCOM,EQ,06-Jan-2010,181.35,182.5,184.4,180.8,181.5,181.8,182.76,4680776,8554.525768,
RCOM,EQ,07-Jan-2010,181.8,183.7,185.3,182.5,183.8,183.9,184.05,4255338,7831.773937,
RCOM,EQ,08-Jan-2010,183.9,184.5,185.15,180.2,181.1,180.85,182.14,3775898,6877.5970215,
RCOM,EQ,11-Jan-2010,180.85,184,184,180.2,181.85,182.1,182.04,3601269,6555.8894695,
RCOM,EQ,12-Jan-2010,182.1,182.1,182.85,175.05,175.3,175.45,179.06,4834928,8657.6031315,
RCOM,EQ,13-Jan-2010,175.45,174,176.6,173.05,175.7,175.55,175.01,3276310,5733.8242525,
RCOM,EQ,14-Jan-2010,175.55,177.2,184.2,175.65,182.85,183,180.8,7227593,13067.2365775,
RCOM,EQ,15-Jan-2010,183,183,193.4,183,191.3,191.6,191.03,15459863,29533.7056915,
RCOM,EQ,18-Jan-2010,191.6,189.9,193.8,188.35,190.05,190.45,191.27,4710277,9009.3851875,
RCOM,EQ,19-Jan-2010,190.45,190,192.9,185.2,186.5,186.35,188.67,4458474,8411.945425,
RCOM,EQ,20-Jan-2010,186.35,187,190.6,185.3,186.6,186.9,188.05,3581194,6734.2921825,
RCOM,EQ,21-Jan-2010,186.9,186.85,189.75,184.15,185.3,185.15,186.85,3673061,6863.2499155,
RCOM,EQ,22-Jan-2010,185.15,183.7,184.7,176.45,181.6,181.55,181.5,4194634,7613.198626,
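
A sketch of how rows in this layout could be read once the file is available,
assuming the format shown above. Every line ends with a trailing comma, which
read.csv treats as one extra, empty column. The sketch reads two of the rows
from an inline string and drops the column that is entirely NA. The names
csv_text and rcom are illustrative only.

```r
csv_text <- "Symbol,Series,Date, Prev Close,Open Price,High Price,Low Price,Last Price,Close Price,Average Price,Total Traded Quantity,Turnover in Lacs,
RCOM,EQ,04-Jan-2010,172.35,173,175.8,172.55,175.25,175.2,174.17,2418999,4213.1160435,
RCOM,EQ,05-Jan-2010,175.2,176,182,175.8,181.45,181.35,178.64,6033757,10778.7459905,"

rcom <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# The trailing comma produces a 13th column with no values; drop it
rcom <- rcom[, colSums(!is.na(rcom)) > 0, drop = FALSE]
```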


Re: [R] Data import export zipped files from URLs

2010-01-23 Thread Henrique Dallazuanna

Your URL is wrong: it is missing ".zip" at the end.

See the code again.

On Sat, Jan 23, 2010 at 6:37 AM, Velappan Periasamy  wrote:
>  cannot open: HTTP status was '404 Not Found' while running the
> following commands
>
> f <- tempfile()
> download.file("http://nseindia.com/content/equities/scripvol/datafiles/01-01-2010-TO-23-01-2010RCOMXN.csv",
> f)
> myData <- read.csv(f)
>
>
> On 1/19/10, Henrique Dallazuanna  wrote:
>> Try this:
>>
>> f <- tempfile()
>> download.file("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip",
>> f)
>> myData <- read.csv(unzip(f))
>>
>> On Tue, Jan 19, 2010 at 2:56 PM, Velappan Periasamy
>> wrote:
>>> How to unzip this file?.
>>>
>>>> mydata <-
>>>> unzip("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip")
>>> Warning message:
>>> In
>>> unzip("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip")
>>> :
>>>  error 1 in extracting from zip file



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: [R] Data import export zipped files from URLs

2010-01-23 Thread Velappan Periasamy

The same link works and downloads the data when I copy and paste it into
the Firefox address bar.
The file is there and the server is active.


Re: [R] Data import export zipped files from URLs

2010-01-23 Thread Velappan Periasamy

I get "cannot open: HTTP status was '404 Not Found'" when running the
following commands:

f <- tempfile()
download.file("http://nseindia.com/content/equities/scripvol/datafiles/01-01-2010-TO-23-01-2010RCOMXN.csv",
f)
myData <- read.csv(f)


On 1/19/10, Henrique Dallazuanna  wrote:
> Try this:
>
> f <- tempfile()
> download.file("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip",
> f)
> myData <- read.csv(unzip(f))
>
> On Tue, Jan 19, 2010 at 2:56 PM, Velappan Periasamy
> wrote:
>> How to unzip this file?.
>>
>>> mydata <-
>>> unzip("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip")
>> Warning message:
>> In
>> unzip("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip")
>> :
>>  error 1 in extracting from zip file


Re: [R] Data Manipulation

2010-01-22 Thread Peter Rote


Thank you Don for the code, 

but I get the following error message:

>  by(AlexETF,AlexETF$Industry,function(a) {filename = paste("C:/ab/",gsub(" ","",a$Industry[1]),".txt",sep="")
+ print(filename)
+ write.table(a[,3,drop=FALSE],quote=FALSE,col.names=FALSE,row.names=FALSE) 
+ }
+  )  

[1] "C:/ab/Accident&HealthInsurance.txt"
Error in `[.data.frame`(a, , 3, drop = FALSE) : 
  undefined columns selected

Best,
Peter
-- 
View this message in context: 
http://n4.nabble.com/Data-Manipulation-tp1018249p1100168.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Data Manipulation

2010-01-22 Thread Don MacQueen


Does this example help?



 a <- matrix(letters[1:12], ncol=3)
 a

     [,1] [,2] [,3]
[1,] "a"  "e"  "i"
[2,] "b"  "f"  "j"
[3,] "c"  "g"  "k"
[4,] "d"  "h"  "l"


 write.table(a[,3,drop=FALSE],quote=FALSE,col.names=FALSE,row.names=FALSE)

i
j
k
l



At 4:11 PM -0800 1/21/10, Peter Rote wrote:

Thank you Dieter and Rolf,

I have solved the slash Problem, but I still struggling  with the output
files.

I have tried this
 by(AlexETF,AlexETF$Industry,function(a) {filename = paste("C:/ab/",gsub("
","",a$Industry[1]),".txt",sep="")
print(filename)
	write.table(a,file=filename,col.names = FALSE)
}

 )

and this

 by(AlexETF,AlexETF$Industry,function(a) {filename = paste("C:/ab/",gsub("
","",a$Industry[1]),".txt",sep="")
print(filename)
	write(as.character(a),file=filename)
}
 ) 



I want in each file just the ticker with out any quotations mark.

CMM
FMCN
IPG
MWW

Thanks in advance,
Peter




--
--
Don MacQueen
Environmental Protection Department
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062


Re: [R] Data Manipulation

2010-01-21 Thread Peter Rote


Thank you Dieter and Rolf,

I have solved the slash problem, but I am still struggling with the output
files.

I have tried this:

by(AlexETF,AlexETF$Industry,function(a) {filename = paste("C:/ab/",gsub(" ","",a$Industry[1]),".txt",sep="")
print(filename)
write.table(a,file=filename,col.names = FALSE)
}
)

and this:

by(AlexETF,AlexETF$Industry,function(a) {filename = paste("C:/ab/",gsub(" ","",a$Industry[1]),".txt",sep="")
print(filename)
write(as.character(a),file=filename)
}
)


I want each file to contain just the tickers, without any quotation marks:

CMM
FMCN
IPG
MWW 

Thanks in advance, 
Peter
 
-- 
View this message in context: 
http://n4.nabble.com/Data-Manipulation-tp1018249p1073567.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Data Manipulation

2010-01-20 Thread Dieter Menne

Peter Rote wrote:
> 
> 
> but i  still have a problem to write to file. The problem is the slash in
> file names (Aerospace/Defense Products & Services ). If i want  it to
> C:/ab/
> so "C:/ab/AdvertisingAgencies.txt" is ok but
> "C:/ab/Aerospace/Defense-MajorDiversified.txt" is not
> 
> 

As Rolf said, the slash is not legal in a file name; under Windows it is
treated like a backslash ("\\"), i.e. as a path separator. Use dir.create to
create the Aerospace directory, or change the slash to something else.

Dieter
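
The two options above could look roughly like this. It is only a sketch: the
base directory under tempdir() stands in for C:/ab/, and the variable names
and the "AAII" file content are chosen for illustration.

```r
industry <- "Aerospace/Defense Products & Services"
base_dir <- file.path(tempdir(), "ab")   # hypothetical stand-in for C:/ab/
dir.create(base_dir, showWarnings = FALSE, recursive = TRUE)

# Option 1: replace the slash so the label becomes a single file name
flat_name <- gsub(" ", "", gsub("/", "-", industry, fixed = TRUE))
flat_path <- file.path(base_dir, paste(flat_name, ".txt", sep = ""))
writeLines("AAII", flat_path)

# Option 2: keep the slash and create the intermediate directory first
nested_path <- file.path(base_dir,
                         paste(gsub(" ", "", industry), ".txt", sep = ""))
dir.create(dirname(nested_path), showWarnings = FALSE, recursive = TRUE)
writeLines("AAII", nested_path)
```

Option 1 keeps all output files in one directory, which is usually easier to
work with afterwards.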

-- 
View this message in context: 
http://n4.nabble.com/Data-Manipulation-tp1018249p1049554.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Data Manipulation

2010-01-20 Thread Rolf Turner



A name such as ``Aerospace/Defense etc.'' is certainly not a legal
file name under unix-alike systems, and I suspect it would not
be even under Windoze.  Even if it is, you shouldn't use it!

Change the name to ``Aerospace-Defense Products & Services''
or something like that, for goodness sake.

cheers,

Rolf Turner



Re: [R] Data Manipulation

2010-01-20 Thread Peter Rote


By the way, how do I change the output

"1016" "Advertising Agencies" "CMM"
"1803" "Advertising Agencies" "FMCN"
"2427" "Advertising Agencies" "IPG"
"3093" "Advertising Agencies" "MWW"
"3372" "Advertising Agencies" "OMC"
"4809" "Advertising Agencies" "VCLK"
"4832" "Advertising Agencies" "VISN"
"5005" "Advertising Agencies" "WPPGY"
"5089" "Advertising Agencies" "XSEL"

to just

CMM
FMCN
IPG
MWW

Peter
-- 
View this message in context: 
http://n4.nabble.com/Data-Manipulation-tp1018249p1032753.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Data Manipulation

2010-01-20 Thread Peter Rote


Thank you Dieter, 

but I still have a problem writing to file. The problem is the slash in the
file names (Aerospace/Defense Products & Services). I want to write to
C:/ab/, so "C:/ab/AdvertisingAgencies.txt" is OK but
"C:/ab/Aerospace/Defense-MajorDiversified.txt" is not.

> head(AlexETF)
                       AlexETF.Industry AlexETF.Ticker
1    Scientific & Technical Instruments              A
2                              Aluminum             AA
3                     Business Services            AAC
4                       Credit Services           AACC
5                     Regional Airlines            AAI
6 Aerospace/Defense Products & Services           AAII

> by(AlexETF,AlexETF$Industry,function(a) {filename = paste(gsub(" ","",a$Industry[1]),".txt",sep="")
+  print(filename)
+  
+}
+ ) 
[1] "Accident&HealthInsurance.txt"
[1] "AdvertisingAgencies.txt"
[1] "Aerospace/Defense-MajorDiversified.txt"
[1] "Aerospace/DefenseProducts&Services.txt"
[1] "AgriculturalChemicals.txt"
[1] "AirDelivery&FreightServices.txt"



> by(AlexETF,AlexETF$Industry,function(a) {filename = paste("C:/ab/",gsub(" ","",a$Industry[1]),".txt",sep="")
+  write.table(a,file=filename,col.names = FALSE) 
+}
+ ) 
Error in file(file, ifelse(append, "a", "w")) : 
  cannot open the connection
In addition: Warning message:
In file(file, ifelse(append, "a", "w")) :
  cannot open file 'C:/ab/Aerospace/Defense-MajorDiversified.txt': No such file or directory



Thanks in advance,

Peter 

-- 
View this message in context: 
http://n4.nabble.com/Data-Manipulation-tp1018249p1032029.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Data Manipulation

2010-01-20 Thread Dieter Menne



Peter Rote wrote:
> 
> I would like to to group the Ticker by Industry and create file names from
> the
> Industry Factor  and export to a txt file.
> 
> I have tried the folowing 
> 
> ind=finvizAllexETF$Industry
> 
> ind is then  "Aluminum"  "Business Services" "Regional Airlines"
> 
> ind2=gsub(" " ,"",ind)
>  ind3
> [1] "Aluminum" "BusinessServices" "RegionalAirlines"
> 
>> for (i in 1:3) ind3[i]<- AllexETF$Ticker[AllexETF$Industry==ind2[i]]
> 
> Warning messages:
> 1: In ind3[i] <- finvizAllexETF$Ticker[AllexETF$Industry == ind2[i]] :
>   number of items to replace is not a multiple of replacement length
> 
> 

If this happens, try to do a 

finvizAllexETF$Ticker[AllexETF$Industry == ind2[i]] 

You will note that it returns not one, but many items, and assigning it to
ind[i] will fail. Sometimes, it helps to add a [1] at the end, but there is
another problem that these are factors and you want strings.

The example below shows one method:

set.seed(4711)
AlexETF = 
 data.frame(Industry=sample(c("Business Services", "Aluminium","Regional Airlines"),10,TRUE),Price = rnorm(10,10))
by(AlexETF,AlexETF$Industry,function(a) {
 filename = paste(gsub(" ","",a$Industry[1]),".txt",sep="")
 print(filename)
 write.table(a,file=filename)
   }
)

 
Dieter
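
The two points made above (a comparison can match several rows, and the
Ticker column is a factor rather than a string) can be seen in a small
sketch. The data frame etf and the ticker "ABM" are made up for illustration.
split() is one possible way to collect the tickers per industry in a single
step.

```r
etf <- data.frame(
  Industry = c("Aluminum", "Business Services", "Business Services"),
  Ticker   = c("AA", "AAC", "ABM"),     # "ABM" is a hypothetical extra row
  stringsAsFactors = TRUE
)

# Several rows can match, so the result does not fit into one element ind3[i]
matches <- etf$Ticker[etf$Industry == "Business Services"]

# A factor stores integer codes; convert explicitly to get the labels
ticker_strings <- as.character(matches)

# One character vector of tickers per industry, as a named list
tickers_by_industry <- split(as.character(etf$Ticker), etf$Industry)
```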

 





-- 
View this message in context: 
http://n4.nabble.com/Data-Manipulation-tp1018249p1018269.html
Sent from the R help mailing list archive at Nabble.com.


[R] Data Manipulation

2010-01-20 Thread Peter Rote


Dear All,

I would like to group the Ticker by Industry, create file names from the
Industry factor, and export each group to a txt file.

I have tried the following:

ind=finvizAllexETF$Industry

ind is then  "Aluminum"  "Business Services" "Regional Airlines"

ind2=gsub(" " ,"",ind)
 ind3
[1] "Aluminum" "BusinessServices" "RegionalAirlines"

> for (i in 1:3) ind3[i]<- AllexETF$Ticker[AllexETF$Industry==ind2[i]]

Warning messages:
1: In ind3[i] <- finvizAllexETF$Ticker[AllexETF$Industry == ind2[i]] :
  number of items to replace is not a multiple of replacement length


> str(AllexETF)
'data.frame':   5137 obs. of  11 variables:
 $ No.       : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Ticker    : Factor w/ 5137 levels "A","AA","AAC",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Company   : Factor w/ 5130 levels "012 Smile.Communications Ltd.",..: 127 158 33 437 141 148 459 25 23 87 ...
 $ Sector    : Factor w/ 9 levels "Basic Materials",..: 8 1 7 4 7 6 4 7 6 7 ...
 $ Industry  : Factor w/ 212 levels "Accident & Health Insurance",..: 175 8 27 43 160 4 105 168 77 16 ...
 $ Country   : Factor w/ 51 levels "Argentina","Australia",..: 51 51 8 51 51 51 51 51 51 51 ...
 $ Market.Cap: num  10614.9 15229.56 6.35 185.38 734.64 ...
 $ P.E       : num  NA NA NA 24.2 NA ...
 $ Price     : num  30.43 15.63 0.78 6.06 5.46 ...
 $ Change    : Factor w/ 1119 levels "","-0.01%","-0.03%",..: 250 114 645 573 109 645 379 10 402 63 ...
 $ Volume    : int  3309434 33389060 7600 42396 4406265 0 4837 447195 75997 738875 ...

head(AllexETF)

  No. Ticker                          Company           Sector
1   1      A        Agilent Technologies Inc.       Technology
2   2     AA                      Alcoa, Inc.  Basic Materials
3   3    AAC            Ableauctions.com Inc.         Services
4   4   AACC   Asset Acceptance Capital Corp.        Financial
5   5    AAI            AirTran Holdings Inc.         Services
6   6   AAII Alabama Aircraft Industries, Inc Industrial Goods
                               Industry Country Market.Cap   P.E Price Change   Volume
1    Scientific & Technical Instruments     USA   10614.90    NA 30.43 -2.31%  3309434
2                              Aluminum     USA   15229.56    NA 15.63 -1.14% 33389060
3                     Business Services  Canada       6.35    NA  0.78  0.00%     7600
4                       Credit Services     USA     185.38 24.24  6.06 -6.34%    42396
5                     Regional Airlines     USA     734.64    NA  5.46 -1.09%  4406265
6 Aerospace/Defense Products & Services     USA       5.58    NA  1.35  0.00%        0


Thanks in advance, 

Peter
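
The warning in the loop above arises because a single element such as ind3[i] can hold only one value, while an industry usually has several tickers. A minimal sketch of one way around it, assuming a small made-up data frame `etf` in place of AllexETF (the ticker "ZZZ" and the output folder are illustrative assumptions, not part of the original data):

```r
# Hypothetical stand-in for AllexETF with only the two columns needed here
etf <- data.frame(
  Ticker   = c("AA", "AAC", "AAI", "ZZZ"),   # "ZZZ" is a made-up ticker
  Industry = c("Aluminum", "Business Services", "Regional Airlines",
               "Business Services"),
  stringsAsFactors = TRUE
)

out_dir <- tempdir()  # assumption: replace with the real output folder

# split() returns one character vector of tickers per industry
tickers_by_industry <- split(as.character(etf$Ticker), etf$Industry)

for (industry in names(tickers_by_industry)) {
  # Strip spaces so "Business Services" becomes "BusinessServices.txt"
  file_name <- file.path(out_dir, paste0(gsub(" ", "", industry), ".txt"))
  writeLines(tickers_by_industry[[industry]], file_name)
}
```

Keeping the tickers in a list (one element per industry) avoids the length mismatch, and the file name is derived from the list names rather than from a separate index vector.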

-- 
View this message in context: 
http://n4.nabble.com/Data-Manipulation-tp1018249p1018249.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Data import export zipped files from URLs

2010-01-19 Thread Henrique Dallazuanna

Try this:

f <- tempfile()
download.file("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip",
f)
myData <- read.csv(unzip(f))
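
The same download-then-unzip idea can be wrapped in a small function. This is only a sketch: `read_zipped_csv` is a hypothetical helper name, and it assumes the archive contains a single CSV file (it reads the first extracted file).

```r
# Sketch: download a zipped CSV, extract it into a temporary folder, read it.
# read_zipped_csv() is a hypothetical helper name, not part of any package.
read_zipped_csv <- function(url, ...) {
  zip_path <- tempfile(fileext = ".zip")
  on.exit(unlink(zip_path), add = TRUE)   # remove the downloaded archive afterwards
  # mode = "wb" keeps the archive byte-for-byte intact (matters on Windows)
  download.file(url, zip_path, mode = "wb")
  # unzip() returns the paths of the extracted files
  extracted <- unzip(zip_path, exdir = tempdir())
  read.csv(extracted[1], ...)             # assumption: the first file is the CSV
}

# Usage (needs network access; URL taken from the question):
# myData <- read_zipped_csv(
#   "http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip")
```

Extracting into tempdir() rather than the working directory keeps the extracted file out of the project folder.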

On Tue, Jan 19, 2010 at 2:56 PM, Velappan Periasamy  wrote:
> How do I unzip this file?
>
>> mydata <- 
>> unzip("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip")
> Warning message:
> In 
> unzip("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip")
> :
>  error 1 in extracting from zip file
>>
>
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: [R] Data import export zipped files from URLs

2010-01-19 Thread Velappan Periasamy

How do I unzip this file?

> mydata <- 
> unzip("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip")
Warning message:
In 
unzip("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip")
:
  error 1 in extracting from zip file
>


Re: [R] Data import export zipped files from URLs

2010-01-19 Thread Gabor Grothendieck

If you need an example of this look at the yacasInstall function in this file:

http://ryacas.googlecode.com/svn/trunk/R/yacasInstall.R

from the Ryacas package.  It downloads, unzips and installs yacas and
associated files for Windows users.

On Tue, Jan 19, 2010 at 3:10 AM, Dieter Menne
 wrote:
>
>
> Velappan Periasamy wrote:
>>
>> I am not able to import zipped files from the following link.
>> How can I get the same into R?
>> mydata <-
>> read.csv("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip")
>>
>
> As Brian Ripley noted in
>
> http://markmail.org/message/7dsauipzagq5y36o
>
> you will have to download it first and then to unzip.
>
> Dieter
>
>
> --
> View this message in context: 
> http://n4.nabble.com/Data-import-export-zipped-files-from-URLs-tp1017287p1017326.html
> Sent from the R help mailing list archive at Nabble.com.
>
>


Re: [R] Data import export zipped files from URLs

2010-01-19 Thread Duncan Temple Lang

Dieter Menne wrote:
> 
> Velappan Periasamy wrote:
>> I am not able to import zipped files from the following link.
>> How can I get the same into R?
>> mydata <-
>> read.csv("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip")
>>
> 
> As Brian Ripley noted in 
> 
> http://markmail.org/message/7dsauipzagq5y36o
> 
> you will have to download it first and then to unzip.

Well if downloading to disk first does need to be avoided, you can use
the RCurl and Rcompression packages to do the computations in memory:

library(RCurl)
ctnt = 
getURLContent("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip")

library(Rcompression)
zz = zipArchive(ctnt)
names(zz)
txt = zz[[1]]
read.csv(textConnection(txt))

 D.

> 
> Dieter
> 
>


Re: [R] Data import export zipped files from URLs

2010-01-19 Thread Dieter Menne

Velappan Periasamy wrote:
> 
> I am not able to import zipped files from the following link.
> How can I get the same into R?
> mydata <-
> read.csv("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip")
> 

As Brian Ripley noted in 

http://markmail.org/message/7dsauipzagq5y36o

you will have to download it first and then to unzip.

Dieter

-- 
View this message in context: 
http://n4.nabble.com/Data-import-export-zipped-files-from-URLs-tp1017287p1017326.html
Sent from the R help mailing list archive at Nabble.com.


[R] Data import export zipped files from URLs

2010-01-18 Thread Velappan Periasamy

I am not able to import zipped files from the following link.
How can I get the same into R?
mydata <- 
read.csv("http://nseindia.com/content/historical/EQUITIES/2010/JAN/cm15JAN2010bhav.csv.zip")


Re: [R] data frame names in sequence. please help!!!

2010-01-10 Thread Zoho


Thank you all. The 'list' approach works well, except that it makes a really
big list, since my data is huge. But it solves the problem anyway. Much
appreciated!


Barry Rowlingson wrote:
> 
> On Sun, Jan 10, 2010 at 7:16 AM, Berend Hasselman  wrote:
>>
>>
>>
>> Zoho wrote:
>>>
>>> I've been stuck with this problem for a whole afternoon. It's silly but
>>> totally pissed me off. I have a set of data frames with names in a
>>> sequence: df_1, df_2, df_3, ..., df_20. Now I want to access each data
>>> frame (read or write) in a for loop, in a way something like this:
>>>
>>> for (i in 1:20) {
>>>   df_i <- ##
>>>   length(which(df_i[,7]==1))
>>>   ##
>>> }
>>>
>>> I tried paste or cat ("df_", i, sep=""). But neither way works. Your help
>>> is highly appreciated!! Thanks in advance!
>>>
>>
>> df_1 <- data.frame(x1=3,x2=5)
>> df_2 <- data.frame(x1=2,x2=7)
>> df_3 <- data.frame(x1=-1,x2=1)
>>
>> for(k in 1:3){v <- paste("df_",k,sep=""); print(get(v))}
>> for(k in 1:3){v <- paste("df",k,sep="_"); print(get(v)[,2])}
>>
>> Have a look at get:
>>
>> ?get
> 
>  Or better still, have a look at making a *list* instead of a bunch of
> data frames with numbers in their names, then you can index in a
> sensible way without having to construct names with paste and get.
> Here's a list of data frames:
> 
>  L = list()
>  for(i in 1 :10){
>   L[[i]]=data.frame(x=runif(10))
> }
> 
>  Now you can loop over L[[i]]
> 
>  This has been asked a zillion times on R-help. Sure, if you've
> already mistakenly created 200 data frames then you need the paste/get
> solution, but don't make the same mistake twice. Use a list.
> 
> Barry
> 
> 
> 

-- 
View this message in context: 
http://n4.nabble.com/data-frame-names-in-sequence-please-help-tp1010518p1010715.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] data frame names in sequence

2010-01-10 Thread jim holtman

?get


for (i in 1:20) {
 df_i <- get(paste('df_', i, sep=''))
 length(which(df_i[,7]==1))
 ##
}
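
Note that inside a for loop the value of length(which(...)) is neither printed nor stored. A small sketch of the same get() approach that keeps the counts and also writes a modified data frame back with assign(); the three tiny data frames and the column name `flag` are made up for illustration (the original uses column 7 of df_1 ... df_20):

```r
# Three small hypothetical data frames standing in for df_1 ... df_20
df_1 <- data.frame(a = 1:3, flag = c(1, 0, 1))
df_2 <- data.frame(a = 4:6, flag = c(0, 0, 1))
df_3 <- data.frame(a = 7:9, flag = c(1, 1, 1))

counts <- integer(3)
for (i in 1:3) {
  name <- paste("df_", i, sep = "")
  df_i <- get(name)                  # read the data frame by its name
  counts[i] <- sum(df_i$flag == 1)   # store the result instead of dropping it
  df_i$checked <- TRUE               # modify the local copy ...
  assign(name, df_i)                 # ... and write it back under the same name
}
counts
```

get() covers the "read" half of the question and assign() the "write" half; both take the object name as a character string.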

On Sat, Jan 9, 2010 at 7:57 PM, Zoho  wrote:

>
> I've been stuck with this problem for a whole afternoon. It's silly but
> totally pissed me off. I have a set of data frames with names in a
> sequence:
> df_1, df_2, df_3, ..., df_20. Now I want to access each data frame (read or
> write) in a for loop, in a way something like this:
>
> for (i in 1:20) {
>  df_i <- ##
>  length(which(df_i[,7]==1))
>  ##
> }
>
> I tried paste or cat ("df_", i, sep=""). But neither way works. Your help
> is highly appreciated!! Thanks in advance!
> --
> View this message in context:
> http://n4.nabble.com/data-frame-names-in-sequence-tp1010518p1010518.html
> Sent from the R help mailing list archive at Nabble.com.
>
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: [R] data frame names in sequence. please help!!!

2010-01-10 Thread Barry Rowlingson

On Sun, Jan 10, 2010 at 7:16 AM, Berend Hasselman  wrote:
>
>
>
> Zoho wrote:
>>
>> I've been stuck with this problem for a whole afternoon. It's silly but
>> totally pissed me off. I have a set of data frames with names in a
>> sequence: df_1, df_2, df_3, ..., df_20. Now I want to access each data
>> frame (read or write) in a for loop, in a way something like this:
>>
>> for (i in 1:20) {
>>   df_i <- ##
>>   length(which(df_i[,7]==1))
>>   ##
>> }
>>
>> I tried paste or cat ("df_", i, sep=""). But neither way works. Your help
>> is highly appreciated!! Thanks in advance!
>>
>
> df_1 <- data.frame(x1=3,x2=5)
> df_2 <- data.frame(x1=2,x2=7)
> df_3 <- data.frame(x1=-1,x2=1)
>
> for(k in 1:3){v <- paste("df_",k,sep=""); print(get(v))}
> for(k in 1:3){v <- paste("df",k,sep="_"); print(get(v)[,2])}
>
> Have a look at get:
>
> ?get

 Or better still, have a look at making a *list* instead of a bunch of
data frames with numbers in their names, then you can index in a
sensible way without having to construct names with paste and get.
Here's a list of data frames:

 L = list()
 for(i in 1 :10){
  L[[i]]=data.frame(x=runif(10))
}

 Now you can loop over L[[i]]

 This has been asked a zillion times on R-help. Sure, if you've
already mistakenly created 200 data frames then you need the paste/get
solution, but don't make the same mistake twice. Use a list.
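
If the numbered data frames already exist, they can be gathered into a list once and handled with lapply/sapply from then on. A minimal sketch, assuming two made-up data frames with a single column `x`:

```r
# Hypothetical data frames that were already created with numbered names
df_1 <- data.frame(x = c(1, 0, 1))
df_2 <- data.frame(x = c(0, 0, 1))

# Collect them once into a named list
dfs <- mget(paste("df_", 1:2, sep = ""), envir = environment())

# Read: one result per data frame
n_ones <- sapply(dfs, function(d) sum(d$x == 1))

# Write: lapply returns a new list of modified data frames
dfs <- lapply(dfs, function(d) {
  d$double <- 2 * d$x
  d
})
```

After this, dfs[["df_2"]] or dfs[[2]] replaces every paste/get construction.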

Barry


[R] data frame names in sequence

2010-01-10 Thread Zoho


I've been stuck with this problem for a whole afternoon. It's silly but
totally pissed me off. I have a set of data frames with names in a sequence:
df_1, df_2, df_3, ..., df_20. Now I want to access each data frame (read or
write) in a for loop, in a way something like this:

for (i in 1:20) {
  df_i <- ##
  length(which(df_i[,7]==1))
  ##
}

I tried paste or cat ("df_", i, sep=""). But neither way works. Your help is
highly appreciated!! Thanks in advance!
-- 
View this message in context: 
http://n4.nabble.com/data-frame-names-in-sequence-tp1010518p1010518.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] data frame names in sequence. please help!!!

2010-01-09 Thread Berend Hasselman




Zoho wrote:
> 
> I've been stuck with this problem for a whole afternoon. It's silly but
> totally pissed me off. I have a set of data frames with names in a
> sequence: df_1, df_2, df_3, ..., df_20. Now I want to access each data
> frame (read or write) in a for loop, in a way something like this:
> 
> for (i in 1:20) {
>   df_i <- ##
>   length(which(df_i[,7]==1))
>   ##
> }
> 
> I tried paste or cat ("df_", i, sep=""). But neither way works. Your help
> is highly appreciated!! Thanks in advance!
> 

df_1 <- data.frame(x1=3,x2=5)
df_2 <- data.frame(x1=2,x2=7)
df_3 <- data.frame(x1=-1,x2=1)

for(k in 1:3){v <- paste("df_",k,sep=""); print(get(v))}
for(k in 1:3){v <- paste("df",k,sep="_"); print(get(v)[,2])}

Have a look at get:

?get

Berend
-- 
View this message in context: 
http://n4.nabble.com/data-frame-names-in-sequence-please-help-tp1010518p1010585.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Data Frame Transpose

2010-01-06 Thread Uwe Ligges




On 06.01.2010 03:14, Noli Sicad wrote:

Hi John

Thanks for your reply. I think I was posting properly the problem.

Here are the error, R script and console errors below.

Thanks. Noli

~~~
The error:
~~
Error in data.frame(CROP_ID = x[1, 1], CROPTYPE = x[1, 2], name =
colnames(x)[4:5],  :
  subscript out of bounds



Probably x does not have at least 5 columns ...
Since we do not have the full data, we cannot see what exactly happens.
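
Another possible cause, judging only from the posted script: y <- t(x[,4:5]) has one column per row of the group, so y[,3] is out of bounds for any group with fewer than three rows, and split() on two factors also produces empty groups unless drop = TRUE is used. A sketch that pads every group to the largest group size, using a small made-up data frame with the same column layout (the values are illustrative only):

```r
# Hypothetical data with the same column layout as the posted script
harvest <- data.frame(
  CROP_ID  = c(83, 83, 83, 5),
  CROPTYPE = c("SORI", "SORI", "SORI", "OTRM"),
  PERIOD   = c(1, 2, 3, 5),
  CUT_AGE  = c(31, 29, 27, 29),
  AREA_CUT = c(528.2, 2.5, 26.9, 65.9)
)

# drop = TRUE discards CROP_ID/CROPTYPE combinations that have no rows
groups <- split(harvest, list(harvest$CROP_ID, harvest$CROPTYPE), drop = TRUE)
n_max  <- max(sapply(groups, nrow))     # widest group decides the column count

fn <- function(x) {
  y <- t(as.matrix(x[, 4:5]))           # 2 x nrow(x) matrix
  padded <- matrix(NA_real_, nrow = 2, ncol = n_max)
  padded[, seq_len(ncol(y))] <- y       # fill what exists, leave the rest NA
  colnames(padded) <- sprintf("x%02d", seq_len(n_max))
  data.frame(CROP_ID = x[1, 1], CROPTYPE = x[1, 2],
             name = colnames(x)[4:5], padded)
}

wide <- do.call(rbind, lapply(groups, fn))
```

Because the number of xNN columns is computed from the data, the line with x01, x02, x03 no longer has to be edited by hand when the number of periods changes.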

Best,
Uwe Ligges



~~~

The number of Period subscripts is dynamic, because it results from a linear
programming (LP) model run. How can I generalise this line? Right now it
handles only 3 indices.

x01=y[,1], x02=y[,2], x03=y[,3])

This is a sample of the data.

PERIOD
1
1
1
1
2
2
2
2
3
3
3
4
4
5
5
5
5
5
6
6
6
6
6
6
7
8
9
10
10
10
10



R  script:

harvest.dat<- read.dbf('C:\\Down2\\R_forestmgt\\Carbon\\forest_cut_m.dbf')

names(harvest.dat) = c("CROP_ID", "CROPTYPE", "PERIOD","CUT_AGE", "AREA_CUT")

# Transpose 5 columns

fn<- function(x) {
  y<- t(x[,4:5])
  data.frame( CROP_ID=x[1,1], CROPTYPE=x[1,2], name=colnames(x)[4:5],
x01=y[,1], x02=y[,2], x03=y[,3])
  }

harvest.dat<- do.call( "rbind",
lapply(split(harvest.dat,list(harvest.dat$CROP_ID,harvest.dat$CROPTYPE)),fn)
)

write.csv(harvest.dat, "forest_cut3.csv")
  ~

Scite console with r package
~

Rscript --vanilla --slave 
"C:\Down2\R_forestmgt\Carbon\ForestCarbon_1_F_Clean7_transpose.R"

[1] "C:/Down2/R_forestmgt/Carbon"
Loading required package: foreign
Loading required package: sp
Loading required package: methods
Loading required package: lattice
Warning messages:
1: package 'maptools' was built under R version 2.10.1
2: package 'foreign' was built under R version 2.10.1
3: package 'sp' was built under R version 2.10.1
Error in data.frame(CROP_ID = x[1, 1], CROPTYPE = x[1, 2], name =
colnames(x)[4:5],  :
  subscript out of bounds
Calls: do.call ->  lapply ->  FUN ->  data.frame
Execution halted

Exit code: 1    Time: 2.128

~


Re: [R] Data Frame Transpose

2010-01-05 Thread Noli Sicad

Hi John

Thanks for your reply. I think I was posting properly the problem.

Here are the error, R script and console errors below.

Thanks. Noli

~~~
The error:
~~
Error in data.frame(CROP_ID = x[1, 1], CROPTYPE = x[1, 2], name =
colnames(x)[4:5],  :
 subscript out of bounds
~~~

The number of Period subscripts is dynamic, because it results from a linear
programming (LP) model run. How can I generalise this line? Right now it
handles only 3 indices.

x01=y[,1], x02=y[,2], x03=y[,3])

This is a sample of the data.

PERIOD
1
1
1
1
2
2
2
2
3
3
3
4
4
5
5
5
5
5
6
6
6
6
6
6
7
8
9
10
10
10
10



R  script:

harvest.dat <- read.dbf('C:\\Down2\\R_forestmgt\\Carbon\\forest_cut_m.dbf')

names(harvest.dat) = c("CROP_ID", "CROPTYPE", "PERIOD","CUT_AGE", "AREA_CUT")

# Transpose 5 columns

fn <- function(x) {
 y <- t(x[,4:5])
 data.frame( CROP_ID=x[1,1], CROPTYPE=x[1,2], name=colnames(x)[4:5],
x01=y[,1], x02=y[,2], x03=y[,3])
 }

harvest.dat <- do.call( "rbind",
lapply(split(harvest.dat,list(harvest.dat$CROP_ID,harvest.dat$CROPTYPE)),fn)
)

write.csv(harvest.dat, "forest_cut3.csv")
 ~

Scite console with r package
~
>Rscript --vanilla --slave 
>"C:\Down2\R_forestmgt\Carbon\ForestCarbon_1_F_Clean7_transpose.R"
[1] "C:/Down2/R_forestmgt/Carbon"
Loading required package: foreign
Loading required package: sp
Loading required package: methods
Loading required package: lattice
Warning messages:
1: package 'maptools' was built under R version 2.10.1
2: package 'foreign' was built under R version 2.10.1
3: package 'sp' was built under R version 2.10.1
Error in data.frame(CROP_ID = x[1, 1], CROPTYPE = x[1, 2], name =
colnames(x)[4:5],  :
 subscript out of bounds
Calls: do.call -> lapply -> FUN -> data.frame
Execution halted
>Exit code: 1    Time: 2.128
~


Re: [R] Data replacement

2010-01-05 Thread Lisa


Thank you for your kind help. Your R script works well.

Lisa



Dieter Menne wrote:
> 
> 
> 
> Lisa wrote:
>> 
>> I have a dataset that looks like this: 
>> 
>>> data
>>    id code1 code2
>> 1   1     1     4
>> 2   1     2     3
>> 3   2     4     4
>> ..
>> 
>> I want to change some numbers in the columns of “code1” and “code2” based
>> on “indx” as below
>> 
>>> indx
>> [[1]]
>> code 
>> 1  1 
>> 2  3 
>> 3  4 
>> For example,  for the first ten records (rows) of my dataset, I want to
>> change 2 to 3, 3 to 4, 4 to 6, and 5 to 8 in both “code1” and “code2”,
>> while for the last ten records, I want to change 3 to 4 and 4 to 6.
>> 
>> 
> 
> You might check for "recode", for example in package car, or for
> "transform". You could also do it the quick and dirty way, good to learn
> indexing. Be careful if you have NA in your data, or data out of the
> recode range.
> 
> Dieter
> 
> 
> data = data.frame(code1=sample(1:5,10,TRUE),code2=sample(1:5,10,TRUE))
> data =
> rbind(data,data.frame(code1=sample(1:4,10,TRUE),code2=sample(1:4,10,TRUE)))
> 
> # The recode table as  in your example
> #indx = list(data.frame(code=c(1,3,4,6,8)),data.frame(code=c(1,2,4,6)))
> 
> #easier to read
> recode1 = c(1,3,4,6,8)
> recode2 = c(1,2,4,6)
> 
> data$code1T[1:10] = recode1[data$code1[1:10]]
> data$code2T[1:10] = recode1[data$code2[1:10]]
> 
> data$code1T[11:20] = recode2[data$code1[11:20]]
> data$code2T[11:20] = recode2[data$code2[11:20]]
> 
> 
> 
> 
> 

-- 
View this message in context: 
http://n4.nabble.com/Data-replacement-tp999060p999342.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Data replacement

2010-01-05 Thread Dieter Menne




Lisa wrote:
> 
> I have a dataset that looks like this: 
> 
>> data
>    id code1 code2
> 1   1     1     4
> 2   1     2     3
> 3   2     4     4
> ..
> 
> I want to change some numbers in the columns of “code1” and “code2” based
> on “indx” as below
> 
>> indx
> [[1]]
> code 
> 1  1 
> 2  3 
> 3  4 
> For example,  for the first ten records (rows) of my dataset, I want to
> change 2 to 3, 3 to 4, 4 to 6, and 5 to 8 in both “code1” and “code2”,
> while for the last ten records, I want to change 3 to 4 and 4 to 6.
> 
> 

You might check for "recode", for example in package car, or for
"transform". You could also do it the quick and dirty way, good to learn
indexing. Be careful if you have NA in your data, or data out of the recode
range.

Dieter


data = data.frame(code1=sample(1:5,10,TRUE),code2=sample(1:5,10,TRUE))
data =
rbind(data,data.frame(code1=sample(1:4,10,TRUE),code2=sample(1:4,10,TRUE)))

# The recode table as  in your example
#indx = list(data.frame(code=c(1,3,4,6,8)),data.frame(code=c(1,2,4,6)))

#easier to read
recode1 = c(1,3,4,6,8)
recode2 = c(1,2,4,6)

data$code1T[1:10] = recode1[data$code1[1:10]]
data$code2T[1:10] = recode1[data$code2[1:10]]

data$code1T[11:20] = recode2[data$code1[11:20]]
data$code2T[11:20] = recode2[data$code2[11:20]]
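
The four assignments above can also be driven directly by the `indx` list from the question. A sketch under the assumption that the row blocks are known in advance (rows 1:10 use the first table, rows 11:20 the second); `blocks` and `recoded` are names introduced here for illustration:

```r
set.seed(1)  # only to make the made-up example data reproducible
data <- data.frame(code1 = c(sample(1:5, 10, TRUE), sample(1:4, 10, TRUE)),
                   code2 = c(sample(1:5, 10, TRUE), sample(1:4, 10, TRUE)))

# Recode tables as in the question: position = old code, value = new code
indx <- list(data.frame(code = c(1, 3, 4, 6, 8)),
             data.frame(code = c(1, 2, 4, 6)))
blocks <- list(1:10, 11:20)   # assumption: which rows use which table

recoded <- data
for (b in seq_along(blocks)) {
  rows   <- blocks[[b]]
  lookup <- indx[[b]]$code
  for (col in c("code1", "code2")) {
    # Index the lookup vector by the old code to get the new code
    recoded[rows, col] <- lookup[data[rows, col]]
  }
}
```

As noted above, an NA or a code outside the lookup range yields NA in the result, so it is worth checking anyNA(recoded) afterwards.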




-- 
View this message in context: 
http://n4.nabble.com/Data-replacement-tp999060p999176.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Data Frame Transpose

2010-01-05 Thread John Kane

Well, if nothing else, you have a missing comma.  :)

x01=y[,1]), x01=y[,1], x02=y[,2], x03=y[,3]
  --
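
Apart from the comma, the wide layout requested in the question (one P column per Period, with Ini_Age placed in the column of its Period) can be built without hand-written x01, x02, ... arguments. A sketch using matrix indexing on a few rows copied from the posted table; `wide` and `result` are names introduced here:

```r
# A few rows copied from the posted table
m <- data.frame(
  P_ID     = c(83, 84, 82, 62, 5),
  Croptype = c("SORI", "SORI", "SORI", "OTRM", "SORM"),
  Period   = c(1, 1, 2, 3, 5),
  Ini_Age  = c(31, 32, 28, 30, 32)
)

n_periods <- max(m$Period)   # number of P columns follows from the data
wide <- matrix(NA_real_, nrow = nrow(m), ncol = n_periods,
               dimnames = list(NULL, paste("P", seq_len(n_periods), sep = "")))

# Matrix indexing: row i gets its Ini_Age in column Period[i]
wide[cbind(seq_len(nrow(m)), m$Period)] <- m$Ini_Age

result <- data.frame(m[, c("P_ID", "Croptype")], wide)
```

Each input row stays one output row, so repeated P_ID/Croptype/Period combinations in the full data are not a problem.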
> 
> fn <- function(x) {
>   y <- t(x[,2])
>   data.frame( Croptype=x[1,1], Period =x[1,2],
> name=colnames(x)[2],
> x01=y[,1])x01=y[,1], x02=y[,2], x03=y[,3] }
> <---Problem
> here
> 
> m <- do.call( "rbind",
> lapply(split(m,list(m$Period,m$Croptype)),fn) )
> 
> m <- m[order(m$Period,m$Croptype),]
> 
> 
> I think I am having a problem here: x01=y[,1])x01=y[,1], x02=y[,2],
> x03=y[,3]. How do I adapt this to my data? I have a variable Period.
> 
> based on this 
> http://www.mail-archive.com/r-h...@stat.math.ethz.ch/msg09264.html
> 
> P_ID  Croptype  Period  Ini_Age  Area_Cut
> 83    SORI           1       31  528.2465512
> 84    SORI           1       32  74.55179899
> 85    SORI           1       33  72.45778618
> 86    SORI           1       34  139.5272947
> 82    SORI           2       28  1.711642933
> 83    SORI           2       29  2.50071
> 84    SORI           2       30  432.5139327
> 93    ORM            2       35  316.8422545
> 62    OTRM           3       30  64.60526438
> 82    SORI           3       27  26.93674606
> 3     SORM           3       35  223.3658345
> 82    SORI           4       26  2.50071
> 4     SORM           4       34  1008.643
> 5     OTRI           5       25  32.42603214
> 5     OTRM           5       29  65.9031344
> 5     SORM           5       32  223.1489321
> 5     SORM           5       33  72.59203041
> 5     SORM           5       35  222.8402746
> 6     OTRI           6       22  2.49851
> 6     OTRI           6       23  3.374626509
> 6     OTRI           6       24  96.13462257
> 6     OTRM           6       26  830.7463641
> 6     OTRM           6       27  731.6228643
> 6     OTRM           6       28  16.3519762
> 7     OTRM           7       26  1636.5693
> 8     OTRM           8       26  553.0050146
> 9     OTRM           9       26  894.414033
> 10    OTRM          10       24  38.72597099
> 10    OTRM          10       25  308.6452707
> 10    OTRM          10       26  786.1761969
> 10    SORM          10       31  235.8360136
> 
> To this.
> 
> P_ID Croptype  P1  P2  P3  P4  P5  P6  P7  P8  P9 P10
> 83   SORI      31
> 84   SORI      32
> 85   SORI      33
> 86   SORI      34
> 82   SORI          28
> 83   SORI          29
> 84   SORI          30
> 93   SORM          35
> 62   OTRM              30
> 82   SORI              27
> 3    SORM              35
> 82   SORI                  26
> 4    SORM                  34
> 5    OTRI                      25
> 5    OTRM                      29
> 5    SORM                      32
> 5    SORM                      33
> 5    SORM                      35
> 6    OTRI                          22
> 6    OTRI                          23
> 6    OTRI                          24
> 6    OTRM                          26
> 6    OTRM                          27
> 6    OTRM                          28
> 7    OTRM                              26
> 8    OTRM                                  26
> 9    OTRM
> 
> Thanks in advance. Noli
> 
> 



[R] Data replacement

2010-01-05 Thread Lisa


Dear all,

I have a question and need your help.

I have a dataset that looks like this: 

> data
   id code1 code2
1   1     1     4
2   1     2     3
3   2     4     4
4   3     1     5
5   3     2     4
6   4     1     1
7   4     3     4
8   6     4     3
9   6     2     2
10  7     5     2
11  1     1     4
12  1     3     2
13  3     4     4
14  4     1     4
15  4     3     2
16  5     1     1
17  5     4     3
18  7     1     4
19  7     2     3
20  8     1     1

I want to change some numbers in the columns of “code1” and “code2” based on
“indx” as below

> indx
[[1]]
code 
1  1 
2  3 
3  4 
4  6 
5  8 
[[2]]
code 
1  1 
2  2 
3  4 
4  6 

For example,  for the first ten records (rows) of my dataset, I want to
change 2 to 3, 3 to 4, 4 to 6, and 5 to 8 in both “code1” and “code2”, while
for the last ten records, I want to change 3 to 4 and 4 to 6.

Can anybody please help how to get this done? Thanks a lot in advance

Lisa

-- 
View this message in context: 
http://n4.nabble.com/Data-replacement-tp999060p999060.html
Sent from the R help mailing list archive at Nabble.com.


[R] Data Frame Transpose

2010-01-05 Thread Noli Sicad

Hi,

forests <- read.csv("C:\\Down2\\R_forestmgt\\forest_cut-Age.csv")

m <- forests

fn <- function(x) {
  y <- t(x[,2])
  data.frame( Croptype=x[1,1], Period =x[1,2], name=colnames(x)[2],
x01=y[,1])x01=y[,1], x02=y[,2], x03=y[,3] } <---Problem
here

m <- do.call( "rbind", lapply(split(m,list(m$Period,m$Croptype)),fn) )

m <- m[order(m$Period,m$Croptype),]


I think I am having a problem here: x01=y[,1])x01=y[,1], x02=y[,2],
x03=y[,3]. How do I adapt this to my data? I have a variable Period.

based on this http://www.mail-archive.com/r-h...@stat.math.ethz.ch/msg09264.html

P_ID  Croptype  Period  Ini_Age  Area_Cut
83    SORI           1       31  528.2465512
84    SORI           1       32  74.55179899
85    SORI           1       33  72.45778618
86    SORI           1       34  139.5272947
82    SORI           2       28  1.711642933
83    SORI           2       29  2.50071
84    SORI           2       30  432.5139327
93    ORM            2       35  316.8422545
62    OTRM           3       30  64.60526438
82    SORI           3       27  26.93674606
3     SORM           3       35  223.3658345
82    SORI           4       26  2.50071
4     SORM           4       34  1008.643
5     OTRI           5       25  32.42603214
5     OTRM           5       29  65.9031344
5     SORM           5       32  223.1489321
5     SORM           5       33  72.59203041
5     SORM           5       35  222.8402746
6     OTRI           6       22  2.49851
6     OTRI           6       23  3.374626509
6     OTRI           6       24  96.13462257
6     OTRM           6       26  830.7463641
6     OTRM           6       27  731.6228643
6     OTRM           6       28  16.3519762
7     OTRM           7       26  1636.5693
8     OTRM           8       26  553.0050146
9     OTRM           9       26  894.414033
10    OTRM          10       24  38.72597099
10    OTRM          10       25  308.6452707
10    OTRM          10       26  786.1761969
10    SORM          10       31  235.8360136

To this.

P_ID Croptype  P1  P2  P3  P4  P5  P6  P7  P8  P9 P10
83   SORI      31
84   SORI      32
85   SORI      33
86   SORI      34
82   SORI          28
83   SORI          29
84   SORI          30
93   SORM          35
62   OTRM              30
82   SORI              27
3    SORM              35
82   SORI                  26
4    SORM                  34
5    OTRI                      25
5    OTRM                      29
5    SORM                      32
5    SORM                      33
5    SORM                      35
6    OTRI                          22
6    OTRI                          23
6    OTRI                          24
6    OTRM                          26
6    OTRM                          27
6    OTRM                          28
7    OTRM                              26
8    OTRM                                  26
9    OTRM

Thanks in advance. Noli


[R] data download from metastock into r-software

2009-12-17 Thread SNV Krishna

Hi All,

Is there a way to import data from MetaStock into R? Most of my data
is in date,OHLC format, downloaded from Reuters into the MetaStock software on
my local PC.

many thanks for the help.,

krishna


Re: [R] Data

2009-12-11 Thread Jorge Ivan Velez

Hi Jose,

Here is a suggestion using tapply():

R> x <- read.table(textConnection("Pepe 2
+ Pepe 3
+ Pepe 4
+ Jose 2
+ Jose 5
+ Manuel 4
+ Manuel 2"), header = FALSE)
R> closeAllConnections()
R> x
      V1 V2
1   Pepe  2
2   Pepe  3
3   Pepe  4
4   Jose  2
5   Jose  5
6 Manuel  4
7 Manuel  2
R>
R>
R> with(x, tapply(V2, V1, sum))
  Jose Manuel   Pepe
     7      6      9

HTH,
Jorge
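
If a two-column result (name, total) is preferred over the named vector that tapply() returns, aggregate() is one alternative. A minimal sketch with the column names `name` and `value` chosen here for illustration:

```r
x <- data.frame(
  name  = c("Pepe", "Pepe", "Pepe", "Jose", "Jose", "Manuel", "Manuel"),
  value = c(2, 3, 4, 2, 5, 4, 2)
)

# One row per unique name, sorted by name
totals <- aggregate(value ~ name, data = x, FUN = sum)

# Optional: restore the order in which the names first appear in the data
totals_in_order <- totals[match(unique(x$name), totals$name), ]
```

The names do not need to be known in advance; aggregate() finds the unique values itself, just as tapply() does.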


On 12/11/2009 4:38 PM, Jose Narillos de Santos wrote:
> Hi all,
>
> Imagine I have a matrix whose first column is a list of repeated names.
> I want to sum the second column for each unique name in the first column.
>
> Imagine this:
>
> Pepe 2
> Pepe 3
> Pepe 4
> Jose 2
> Jose 5
> Manuel 4
> Manuel 2
>
> I want to make a new matrix that recognizes that there are 3 different
> names and sums the second column for each. But a priori I don't know the
> list of different names.
>
> In my example:
>
> Pepe 9
> Jose 7
> Manuel 6
>
> I'm trying to use something like sapply or apply but I can't find the key...
>
> Can anyone help or guide me?
>
> Thanks in advance!!!
>
>
>
>
>
>



Re: [R] Data

2009-12-11 Thread David Winsemius



On Dec 11, 2009, at 4:38 PM, Jose Narillos de Santos wrote:


> Hi all,
>
> Imagine I have a matrix and the first column is a list that repeats the same
> names, I want to sum the second column on each unique name on first column.
>
> Imagine this:
>
> Pepe 2
> Pepe 3
> Pepe 4
> Jose 2
> Jose 5
> Manuel 4
> Manuel 2

?tapply   # or one of its derivative functions like by or aggregate

> I want to make a new matrix that calculates and recognizes that there are 3
> different names and sum second column. But a priori I don't know the list of
> the different names:
>
> In my example

Something like:

tapply(valuecol, namecol, sum)

(Untested.)
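
A minimal sketch of that call on the example data, assuming the input really is a two-column character matrix as the question describes (the object names `m` and `sums` are illustrative). Because a matrix holding names stores everything as text, the value column is converted first:

```r
# Hypothetical character matrix: names in column 1, values (as text) in column 2.
m <- cbind(c("Pepe", "Pepe", "Pepe", "Jose", "Jose", "Manuel", "Manuel"),
           c("2", "3", "4", "2", "5", "4", "2"))

# Sum the numeric values within each distinct name.
sums <- tapply(as.numeric(m[, 2]), m[, 1], sum)

# tapply() sorts the names; index by unique() to restore input order.
sums[unique(m[, 1])]
```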



> Pepe 9
> Jose 7
> Manuel 6
>
> I'm trying to use something like sapply or apply but I can't find the key...
>
> Can anyone help or guide me ?
>
> Thanks in advance!!!


David Winsemius, MD
Heritage Laboratories
West Hartford, CT


[R] Data

2009-12-11 Thread Jose Narillos de Santos

Hi all,

Imagine I have a matrix and the first column is a list that repeats the same
names, I want to sum the second column on each unique name on first column.

Imagine this:

Pepe 2
Pepe 3
Pepe 4
Jose 2
Jose 5
Manuel 4
Manuel 2

I want to make a new matrix that calculates and recognizes that there are 3
different names and sum second column. But a priori I don't know the list of
the different names:

In my example

Pepe 9
Jose 7
Manuel 6

I'm trying to use something like sapply or apply but I can't find the key...

Can anyone help or guide me ?

Thanks in advance!!!


Re: [R] data manipulation/subsetting and relation matrix

2009-12-08 Thread jim holtman

try this:

myDat <- read.table(textConnection("group id
1 101
1 201
1 301
2 401
2 501
2 601
3 701
3 801
3 901"),header=TRUE)
closeAllConnections()
corr_mat <-as.matrix(read.table(textConnection("1 1 .5 0 0 0 0 0 0 0
2 .5 1 0 0 0 0 0 0 0
3 0 0 1.0 0 0 0 0 0 0
4 0 0 0 1 .5 .5 0 0 0
5 0 0 0 .5 1 .5 0 0 0
6 0 0 0 .5 .5 1 0 0 0
7 0 0 0 0 0 0 1 0 0
8 0 0 0 0 0 0 0 1 .5
9 0 0 0 0 0 0 0 .5 1"),header=FALSE))
closeAllConnections()
corr_mat <- corr_mat[,-1]
colnames(corr_mat) <- myDat$id
rownames(corr_mat) <- myDat$id
# split out the groups
groups <- split(as.character(myDat$id), myDat$group)
# process each subgroup
result <- lapply(groups, function(.grp){
    subgroup <- corr_mat[.grp, .grp]
    output <- NULL
    # zero the diag
    diag(subgroup) <- 0
    # TRUE for ids related to at least one other member of the group
    same <- apply(subgroup, 1, function(x) any(x != 0))
    if (any(same)){  # some match, choose one
        output <- sample(same[same], 1)
    }
    if (any(!same)){  # get all that don't correlate
        output <- c(output, same[!same])
    }
    output
})
# output as matrix
do.call(rbind, lapply(names(result), function(x) cbind(x,
    names(result[[x]]))))

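The code above keeps one randomly chosen member from all the related ids in a group. As a possible variant (not part of the reply above), here is a minimal sketch of a greedy selection that walks through a group in order and keeps an id only if it is unrelated to every id already kept. The helper name `pick_unrelated` and the small matrix `rel` are hypothetical, and the relation matrix is assumed to carry the ids as character dimnames:

```r
# Hypothetical helper: greedily keep ids unrelated to every id kept so far.
pick_unrelated <- function(ids, rel) {
  kept <- character(0)
  for (id in ids) {
    # With nothing kept yet, all() over an empty vector is TRUE.
    if (all(rel[id, kept] == 0)) {
      kept <- c(kept, id)
    }
  }
  kept
}

# Group 1 from the example: 101 and 201 are related, 301 stands alone.
ids <- c("101", "201", "301")
rel <- matrix(c(1, .5, 0,
                .5, 1, 0,
                0,  0, 1),
              nrow = 3, byrow = TRUE, dimnames = list(ids, ids))

pick_unrelated(ids, rel)
```

This always keeps the first member of each related pair, so shuffle `ids` with sample() first if a random choice is wanted.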


On Mon, Dec 7, 2009 at 7:38 PM, Juliet Hannah wrote:

> Hi List,
>
> Here is some example data.
>
> myDat <- read.table(textConnection("group id
> 1 101
> 1 201
> 1 301
> 2 401
> 2 501
> 2 601
> 3 701
> 3 801
> 3 901"),header=TRUE)
> closeAllConnections()
>
> corr_mat <-read.table(textConnection("1 1 .5 0 0 0 0 0 0 0
> 2 .5 1 0 0 0 0 0 0 0
> 3 0 0 1.0 0 0 0 0 0 0
> 4 0 0 0 1 .5 .5 0 0 0
> 5 0 0 0 .5 1 .5 0 0 0
> 6 0 0 0 .5 .5 1 0 0 0
> 7 0 0 0 0 0 0 1 0 0
> 8 0 0 0 0 0 0 0 1 .5
> 9 0 0 0 0 0 0 0 .5 1"),header=FALSE)
> closeAllConnections()
>
> corr_mat <- corr_mat[,-1]
> colnames(corr_mat) <- myDat$id
> rownames(corr_mat) <- myDat$id
>
> I need to subset this data such that observations within a group are not
> related, which is indicated by a 0 in corr_mat.
>
> For example, within group 1, 101 and 201 are related, so one of these
> has to be selected, say
> 101. 301 is not related to 101 or 201, so the final set for group 1
> consists of 101 and 301. There will always be at least 2 members in
> each group. I need to carry this task on all groups.
>
> One possible final data set looks like:
>
>  group  id
> 1 1 101
> 3 1 301
> 4 2 401
> 7 3 701
> 8 3 801
>
> Any suggestions? Thanks!
>
> Juliet
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


[R] data manipulation/subsetting and relation matrix

2009-12-07 Thread Juliet Hannah

Hi List,

Here is some example data.

myDat <- read.table(textConnection("group id
1 101
1 201
1 301
2 401
2 501
2 601
3 701
3 801
3 901"),header=TRUE)
closeAllConnections()

corr_mat <-read.table(textConnection("1 1 .5 0 0 0 0 0 0 0
2 .5 1 0 0 0 0 0 0 0
3 0 0 1.0 0 0 0 0 0 0
4 0 0 0 1 .5 .5 0 0 0
5 0 0 0 .5 1 .5 0 0 0
6 0 0 0 .5 .5 1 0 0 0
7 0 0 0 0 0 0 1 0 0
8 0 0 0 0 0 0 0 1 .5
9 0 0 0 0 0 0 0 .5 1"),header=FALSE)
closeAllConnections()

corr_mat <- corr_mat[,-1]
colnames(corr_mat) <- myDat$id
rownames(corr_mat) <- myDat$id

I need to subset this data such that observations within a group are not
related, which is indicated by a 0 in corr_mat.

For example, within group 1, 101 and 201 are related, so one of these
has to be selected, say
101. 301 is not related to 101 or 201, so the final set for group 1
consists of 101 and 301. There will always be at least 2 members in
each group. I need to carry this task on all groups.

One possible final data set looks like:

  group  id
1 1 101
3 1 301
4 2 401
7 3 701
8 3 801

Any suggestions? Thanks!

Juliet


Re: [R] Data Manipulation Question

2009-12-04 Thread Gray Calhoun

This is probably far more discussion than the question warranted, but...

On Thu, Dec 3, 2009 at 11:14 PM, David Winsemius  wrote:
>
> On Dec 3, 2009, at 10:52 PM, Gray Calhoun wrote:
>
>> The data import/export manual can elaborate on a lot of these; this is
>> all straightforward, although many people would prefer to use a
>> relational database for some of the things you mentioned.
>
> See Wickham's pithy response to this.

Sure.  My (indirect) point is that representing query results as
separate files is usually not the right approach, regardless of
statistical language/package one uses.

>
>> I'm not
>> aware of a "goto" command in R, though (although I could be wrong).
>
> In fairness to the OP, he did not ask if there were a go-to construct, but
> rather whether there were a "gosub" construct that supported "modular
> programming". My response would have been that calling modular functions
> (i.e., subroutines with defined arguments) is fundamental to R and the key
> to understanding how to use it with grace and efficiency. I would say that
> the concept of functional programming is to a much greater extent supported
> by R than by SAS, whose datastep mechanisms (as I remember them from earlier
> incarnation) in no way supported modular programming. I suspect that S and R
> arose precisely because of the mental straightjackets imposed by SAS.

From the original: "Goto Start until File 1 Done."  But, yes, probably
unfair and certainly less informative than your response.

>
> --
> David.
>
>
>> --Gray
>>
>> On Thu, Dec 3, 2009 at 1:52 PM, John Filben  wrote:
>>>
>>> Can R support data manipulation programming that is available in the SAS
>>> datastep?  Specifically, can R support the following:
>>> -          Read multiple dataset one record at a time and compare values
>>> from each; then base on if-then logic write to multiple output files
>>> -          Load a lookup table and then process a different file; based
>>> on if-then logic, access and lookup values in the table
>>> -          Support modular “gosub”programming
>>> -          Sort files
>>> -          Date math and conversions
>>> -          Would it be able to support the following type of logic:
>>> o   Start
>>> §  Read Record from File 1
>>> §  Read Record from File 2
>>> §  Match
>>> ·         If Key 1 <> Key 2 and Key 1 < Key 2, Write to output file A
>>> ·         If Key 1 = Key 2, Write to output file B
>>> ·         If Key 1 <> Key 2 and Key 1 > Key 2, Write to output file C
>>> §  Goto Start until File 1 Done
>>>  John Filben
>>> Cell Phone - 773.401.2822
>>> Email - johnfil...@yahoo.com
>>>
>>>
>>>
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
>


Re: [R] Data Manipulation Question

2009-12-04 Thread Barry Rowlingson

On Thu, Dec 3, 2009 at 9:52 PM, John Filben  wrote:
> Can R support data manipulation programming that is available in the SAS 
> datastep?  Specifically, can R support the following:
> -  Read multiple dataset one record at a time and compare values from 
> each; then base on if-then logic write to multiple output files
> -  Load a lookup table and then process a different file; based on 
> if-then logic, access and lookup values in the table
> -  Support modular “gosub”programming
> -  Sort files
> -  Date math and conversions
> -  Would it be able to support the following type of logic:
> o   Start
> §  Read Record from File 1
> §  Read Record from File 2
> §  Match
> · If Key 1 <> Key 2 and Key 1 < Key 2, Write to output file A
> · If Key 1 = Key 2, Write to output file B
> · If Key 1 <> Key 2 and Key 1 > Key 2, Write to output file C
> §  Goto Start until File 1 Done
>  John Filben

I'll expand on Hadley Wickham's "Yes", to say "Yes, and it wouldn't be
much of a 'system for statistical computation and graphics' if it
couldn't do that".

Remember R uses the 'S' and C programming languages and is Open
Source. If it _can't_ do something you want it to do, you can write
code that does it. Like the date math and conversions. Originally,
maybe way back in R version 0.something, it didn't have that. But
someone wrote it, and wisely contributed it, and the community saw
that it was good. And now we have date math and conversions. And
nobody has to write any date math or conversion codes ever again.
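
A minimal sketch of the date math and conversions mentioned above, using base R's Date class; the dates and variable names are made up for illustration:

```r
# Convert text to dates; ISO "yyyy-mm-dd" parses directly,
# other layouts need an explicit format string.
d1 <- as.Date("2009-12-03")
d2 <- as.Date("12/25/2009", format = "%m/%d/%Y")

# Subtracting two dates gives a difftime; as.numeric() extracts the day count.
days_between <- as.numeric(d2 - d1)

# Adding a number to a date moves it forward by that many days.
later <- d1 + 30

# Convert back to text in whatever layout is needed.
format(later, "%Y-%m-%d")
```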

  Now tell me how to get something into the SAS core code.

Barry

P.S. I see a very obvious optimisation you can do on this line:

  If Key 1 <> Key 2 and Key 1 < Key 2, Write to output file A

but maybe that's some kind of weird SASism


Re: [R] Data Manipulation Question

2009-12-03 Thread David Winsemius



On Dec 3, 2009, at 10:52 PM, Gray Calhoun wrote:


> The data import/export manual can elaborate on a lot of these; this is
> all straightforward, although many people would prefer to use a
> relational database for some of the things you mentioned.

See Wickham's pithy response to this.

> I'm not
> aware of a "goto" command in R, though (although I could be wrong).


In fairness to the OP, he did not ask if there were a go-to construct,  
but rather whether there were a "gosub" construct that supported  
"modular programming". My response would have been that calling  
modular functions (i.e., subroutines with defined arguments) is  
fundamental to R and the key to understanding how to use it with grace  
and efficiency. I would say that the concept of functional programming  
is to a much greater extent supported by R than by SAS, whose datastep  
mechanisms (as I remember them from earlier incarnation) in no way  
supported modular programming. I suspect that S and R arose precisely  
because of the mental straightjackets imposed by SAS.
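
A minimal sketch of a function standing in for a "gosub" routine, using the key comparison from the original question; the function name `route_record` and the labels "A", "B", "C" for the output files are hypothetical:

```r
# Hypothetical "subroutine": decide which output file a pair of keys belongs to.
route_record <- function(key1, key2) {
  if (key1 < key2) {
    "A"
  } else if (key1 == key2) {
    "B"
  } else {
    "C"
  }
}

# Call it once per pair of keys; mapply() takes the place of an explicit loop.
keys1 <- c(1, 2, 5)
keys2 <- c(2, 2, 3)
routes <- mapply(route_record, keys1, keys2)
routes
```

The function can be called from anywhere with any arguments, which is the modularity a "gosub" is meant to provide.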


--
David.



> --Gray
>
> On Thu, Dec 3, 2009 at 1:52 PM, John Filben wrote:
>> Can R support data manipulation programming that is available in the SAS
>> datastep?  Specifically, can R support the following:
>> -  Read multiple dataset one record at a time and compare values from
>> each; then base on if-then logic write to multiple output files
>> -  Load a lookup table and then process a different file; based on
>> if-then logic, access and lookup values in the table
>> -  Support modular “gosub”programming
>> -  Sort files
>> -  Date math and conversions
>> -  Would it be able to support the following type of logic:
>> o   Start
>> §  Read Record from File 1
>> §  Read Record from File 2
>> §  Match
>> · If Key 1 <> Key 2 and Key 1 < Key 2, Write to output file A
>> · If Key 1 = Key 2, Write to output file B
>> · If Key 1 <> Key 2 and Key 1 > Key 2, Write to output file C
>> §  Goto Start until File 1 Done
>>
>> John Filben
>> Cell Phone - 773.401.2822
>> Email - johnfil...@yahoo.com


David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] Data Manipulation Question

2009-12-03 Thread Gray Calhoun

The data import/export manual can elaborate on a lot of these; this is
all straightforward, although many people would prefer to use a
relational database for some of the things you mentioned.  I'm not
aware of a "goto" command in R, though (although I could be wrong).
--Gray

On Thu, Dec 3, 2009 at 1:52 PM, John Filben  wrote:
> Can R support data manipulation programming that is available in the SAS 
> datastep?  Specifically, can R support the following:
> -  Read multiple dataset one record at a time and compare values from 
> each; then base on if-then logic write to multiple output files
> -  Load a lookup table and then process a different file; based on 
> if-then logic, access and lookup values in the table
> -  Support modular “gosub”programming
> -  Sort files
> -  Date math and conversions
> -  Would it be able to support the following type of logic:
> o   Start
> §  Read Record from File 1
> §  Read Record from File 2
> §  Match
> · If Key 1 <> Key 2 and Key 1 < Key 2, Write to output file A
> · If Key 1 = Key 2, Write to output file B
> · If Key 1 <> Key 2 and Key 1 > Key 2, Write to output file C
> §  Goto Start until File 1 Done
>  John Filben
> Cell Phone - 773.401.2822
> Email - johnfil...@yahoo.com
>
>
>


Re: [R] Data Manipulation Question

2009-12-03 Thread Jason Morgan

Please refrain from posting HTML. The results can be incomprehensible:

On 2009.12.03 13:52:09, John Filben wrote:
> Can R support data manipulation programming that is available in the SAS 
> datastep??? Specifically, can R support the following:
> -?? Read multiple dataset one record at a time and compare 
> values from each; then base on if-then logic write to multiple output files
> -?? Load a lookup table and then process a different file; 
> based on if-then logic, access and lookup values in the table
> -?? Support modular ???gosub???programming
> -?? Sort files
> -?? Date math and conversions
> -?? Would it be able to support the following type of logic:
> o Start
>  Read Record from File 1
>  Read Record from File 2
>  Match
> ?? If Key 1 <> Key 2 and Key 1 < Key 2, Write to output file A
> ?? If Key 1 = Key 2, Write to output file B
> ?? If Key 1 <> Key 2 and Key 1 > Key 2, Write to output file 
> C Goto Start until File 1 Done
> ??John Filben
> Cell Phone - 773.401.2822
> Email - johnfil...@yahoo.com 
>   
>   [[alternative HTML version deleted]]

-- 
Jason W. Morgan
Graduate Student
Department of Political Science
*The Ohio State University*
154 North Oval Mall
Columbus, Ohio 43210


Re: [R] Data Manipulation Question

2009-12-03 Thread hadley wickham

On Thu, Dec 3, 2009 at 3:52 PM, John Filben  wrote:
> Can R support data manipulation programming that is available in the SAS 
> datastep?  Specifically, can R support the following:
> -  Read multiple dataset one record at a time and compare values from 
> each; then base on if-then logic write to multiple output files
> -  Load a lookup table and then process a different file; based on 
> if-then logic, access and lookup values in the table
> -  Support modular “gosub”programming
> -  Sort files
> -  Date math and conversions
> -  Would it be able to support the following type of logic:
> o   Start
> §  Read Record from File 1
> §  Read Record from File 2
> §  Match
> · If Key 1 <> Key 2 and Key 1 < Key 2, Write to output file A
> · If Key 1 = Key 2, Write to output file B
> · If Key 1 <> Key 2 and Key 1 > Key 2, Write to output file C
> §  Goto Start until File 1 Done

Yes.
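
One way the record-matching logic in the question might look in R. This is a minimal sketch, assuming both inputs fit in memory as data frames with a unique `key` column; the names `file1`, `file2`, `key`, `out_a`, `out_b` and `out_c` are hypothetical. Instead of reading record by record in a goto loop, it works on whole columns:

```r
# Hypothetical inputs standing in for "File 1" and "File 2".
file1 <- data.frame(key = c(1, 2, 4), x = c("a", "b", "c"))
file2 <- data.frame(key = c(2, 3, 4), y = c("p", "q", "r"))

# Keys found only in file 1 (the "Key 1 < Key 2" branch): output A.
out_a <- file1[!file1$key %in% file2$key, ]

# Keys found in both files (the "Key 1 = Key 2" branch): output B.
out_b <- merge(file1, file2, by = "key")

# Keys found only in file 2 (the "Key 1 > Key 2" branch): output C.
out_c <- file2[!file2$key %in% file1$key, ]

# Each piece could then be written out with write.table(), for example:
# write.table(out_b, "B.txt", row.names = FALSE)
```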

Hadley


-- 
http://had.co.nz/


[R] Data Manipulation Question

2009-12-03 Thread John Filben

Can R support data manipulation programming that is available in the SAS
datastep?  Specifically, can R support the following:
-  Read multiple datasets one record at a time and compare values from
   each; then, based on if-then logic, write to multiple output files
-  Load a lookup table and then process a different file; based on
   if-then logic, access and lookup values in the table
-  Support modular “gosub” programming
-  Sort files
-  Date math and conversions
-  Would it be able to support the following type of logic:
   o  Start
      §  Read Record from File 1
      §  Read Record from File 2
      §  Match
         ·  If Key 1 <> Key 2 and Key 1 < Key 2, Write to output file A
         ·  If Key 1 = Key 2, Write to output file B
         ·  If Key 1 <> Key 2 and Key 1 > Key 2, Write to output file C
      §  Goto Start until File 1 Done

John Filben
Cell Phone - 773.401.2822
Email - johnfil...@yahoo.com 


  

[R] [R-pkgs] Deducer: An R data analysis GUI

2009-12-03 Thread ian . fellows

Announcing a new version of Deducer:

Deducer 0.2-1 is an intuitive, cross-platform graphical data analysis
system. It uses menus and dialogs to guide the user efficiently through
the data manipulation and analysis process, and has an Excel-like
spreadsheet for easy data frame visualization and editing. Deducer works
best when used with the Java-based R GUI JGR, but the dialogs can be
called from the command line. Dialogs have also been integrated into the
Windows Rgui.

The statistical methods and concepts covered by the dialogs are increasing,
and currently include:

Data Manipulation: factor editing, variable recoding, subsetting, sorting,
merging, transposing, opening data (text and foreign), and saving data

Analysis: Frequencies, Descriptives, Contingency tables (and related
statistics), one-sample, two-sample, k-sample tests, as well as
correlations

Models: Linear Models (with optional HCCM), Logistic regression,
Generalized Linear Models

Since its initial release in August, there have been significant changes
to the back-end as well as the programmatic interface. This has resulted
in increased stability, and made for easier incorporation of Deducer's R
functions into non-GUI programs. Additionally, a plug-in interface has
been added, which allows arbitrary packages to add onto Deducer's menu
system.

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages


Re: [R] Data frame/read.ftable

2009-12-03 Thread Robinson, David G

David,
Great! 'split' is something I didn't even look at.  Owe you one.
Many thanks,
Dave


On 12/2/09 7:29 PM, "David Winsemius"  wrote:



On Dec 2, 2009, at 7:02 PM, Robinson, David G wrote:

> My apologies for this question but I'm stuck and I'm sure that there
> must be
> an easy answer out there (and hope that someone will have mercy and
> point me
> in the right direction).
>
> I have a data file that looks like:
> 1 77 3
> 1 8 1
> 1 7 2
> 1 1 5
> 1 42 7
> 1 0 2
> 1 23 1
> 2 83 9
> 2 8 2
> 2 6 5
> 2 23 3
> 3 11 3
> 3 8 1
> .
>  etc.
> .
> N   3   2
>
>
> (FWIW, these are document, word reference, and word frequency
> counts.) I
> want to read the data into data frame, Doc, such that
> Doc[[1]]=
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
> [1,]   77    8    7    1   42    0   23
> [2,]    3    1    2    5    7    2    1
>
> Doc[[2]]=
>      [,1] [,2] [,3] [,4]
> [1,]   83    8    6   23
> [2,]    9    2    5    3
>
> Etc.

rd.txt <- function(txt, header=TRUE) {read.table(textConnection(txt),
header=header)}

 > dta <- rd.txt("1 77 3
+ 1 8 1
+ 1 7 2
+ 1 1 5
+ 1 42 7
+ 1 0 2
+ 1 23 1
+ 2 83 9
+ 2 8 2
+ 2 6 5
+ 2 23 3
+ 3 11 3
+ 3 8 1", header=F)
 > dta
   V1 V2 V3
1   1 77  3
2   1  8  1
3   1  7  2
4   1  1  5
5   1 42  7
6   1  0  2
7   1 23  1
8   2 83  9
9   2  8  2
10  2  6  5
11  2 23  3
12  3 11  3
13  3  8  1


 > split(dta[ ,-1], list(dta[,1]))
$`1`
  V2 V3
1 77  3
2  8  1
3  7  2
4  1  5
5 42  7
6  0  2
7 23  1

$`2`
   V2 V3
8  83  9
9   8  2
10  6  5
11 23  3

$`3`
   V2 V3
12 11  3
13  8  1

 > ?split
 > lapply(split(dta[ ,-1], list(dta[,1])), t)
$`1`
    1 2 3 4  5 6  7
V2 77 8 7 1 42 0 23
V3  3 1 2 5  7 2  1

$`2`
    8 9 10 11
V2 83 8  6 23
V3  9 2  5  3

$`3`
   12 13
V2 11  8
V3  3  1

>
>
> It seems like I should be able to do this using a flat contingency
> table
> method such as 'read.ftable' or possibly using 'stack' . However,
> something
> is not clicking and hence my plea for assistance.
>
> Thanks in advance,
> Dave Robinson
> dro...@sandia.gov
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT





Re: [R] data manipulation

2009-12-03 Thread Gabor Grothendieck

Try this where [0-9]+ matches one or more digits and $ matches the end of
string.  See http://gsubfn.googlecode.com for more.

library(gsubfn)
x <- c("v2FfaPre15", "v2FfaPre10", "v2FfaPre5", "v2Ffa2", "v2Ffa3",
"v2Ffa4")

strapply(x, "[0-9]+$", c, simplify = TRUE)


# or if you want a numeric result:
strapply(x, "[0-9]+$", as.numeric, simplify = TRUE)
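
For comparison, a minimal sketch of the same pattern in base R without the gsubfn package, assuming every name ends in at least one digit (the variable `digits` is illustrative):

```r
x <- c("v2FfaPre15", "v2FfaPre10", "v2FfaPre5", "v2Ffa2", "v2Ffa3",
       "v2Ffa4")

# regexpr() locates the run of digits anchored at the end of each string;
# regmatches() extracts the matched text.
digits <- regmatches(x, regexpr("[0-9]+$", x))

# Convert the extracted text to numbers.
as.numeric(digits)
```

If some names had no trailing digits, regmatches() would silently drop them, so the result could be shorter than `x`.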

On Thu, Dec 3, 2009 at 9:00 AM, oscar linares  wrote:

> Dear Wiza[R]ds,
>
> I have a data.frame header that looks like this:
>
> v2FfaPre15    v2FfaPre10    v2FfaPre5    v2Ffa2    v2Ffa3    v2Ffa4
>
> I need it to look like this,
>
> 15    10    5    2    3     4
>
> i.e., with v2FfaPre and  v2Ffa stripped off
>
> Any suggestions,
>
> Thanks in advance!
>
> --
> Oscar
> Oscar A. Linares, MD
> Translational Medicine Unit
> LaPlaisance Bay, Bolles Harbor
> Monroe, Michigan 48161
>
> Department of Medicine,
> University of Toledo College of Medicine
> Toledo, OH 43606-3390
>
> Department of Internal Medicine,
> The Detroit Medical Center (DMC)
> Harper University Hospital
> Wayne State University School of Medicine
> Detroit, Michigan 48201
>


Re: [R] data manipulation

2009-12-03 Thread Henrique Dallazuanna

Try this:

gsub(".*[^0-9]", "", header)
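
A minimal sketch of applying that pattern directly to the column names of a data frame; the data frame `df` and its values are made up for illustration:

```r
# Hypothetical data frame with two of the column names from the question.
df <- data.frame(v2FfaPre15 = 1, v2Ffa2 = 2)

# ".*[^0-9]" greedily matches everything up to the last non-digit,
# so only the trailing digits survive the replacement.
names(df) <- gsub(".*[^0-9]", "", names(df))
names(df)
```

The resulting names are not syntactic R names, so such columns have to be referenced as df[["15"]] or with backticks.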


On Thu, Dec 3, 2009 at 12:00 PM, oscar linares  wrote:
> Dear Wiza[R]ds,
>
> I have a data.frame header that looks like this:
>
> v2FfaPre15    v2FfaPre10    v2FfaPre5    v2Ffa2    v2Ffa3    v2Ffa4
>
> I need it to look like this,
>
> 15    10    5    2    3     4
>
> i.e., with v2FfaPre and  v2Ffa stripped off
>
> Any suggestions,
>
> Thanks in advance!
>
> --
> Oscar
> Oscar A. Linares, MD
> Translational Medicine Unit
> LaPlaisance Bay, Bolles Harbor
> Monroe, Michigan 48161
>
> Department of Medicine,
> University of Toledo College of Medicine
> Toledo, OH 43606-3390
>
> Department of Internal Medicine,
> The Detroit Medical Center (DMC)
> Harper University Hospital
> Wayne State University School of Medicine
> Detroit, Michigan 48201
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: [R] data manipulation

2009-12-03 Thread jim holtman

try this:

> x <- c('v2FfaPre15','v2FfaPre10','v2FfaPre5','v2Ffa2',
> 'v2Ffa3','v2Ffa4')
> sub("^.*?([0-9]+)$", "\\1", x, perl=TRUE)
[1] "15" "10" "5"  "2"  "3"  "4"
>


On Thu, Dec 3, 2009 at 9:00 AM, oscar linares  wrote:
> Dear Wiza[R]ds,
>
> I have a data.frame header that looks like this:
>
> v2FfaPre15    v2FfaPre10    v2FfaPre5    v2Ffa2    v2Ffa3    v2Ffa4
>
> I need it to look like this,
>
> 15    10    5    2    3     4
>
> i.e., with v2FfaPre and  v2Ffa stripped off
>
> Any suggestions,
>
> Thanks in advance!
>
> --
> Oscar
> Oscar A. Linares, MD
> Translational Medicine Unit
> LaPlaisance Bay, Bolles Harbor
> Monroe, Michigan 48161
>
> Department of Medicine,
> University of Toledo College of Medicine
> Toledo, OH 43606-3390
>
> Department of Internal Medicine,
> The Detroit Medical Center (DMC)
> Harper University Hospital
> Wayne State University School of Medicine
> Detroit, Michigan 48201
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


[R] data manipulation

2009-12-03 Thread oscar linares

Dear Wiza[R]ds,

I have a data.frame header that looks like this:

v2FfaPre15    v2FfaPre10    v2FfaPre5    v2Ffa2    v2Ffa3    v2Ffa4

I need it to look like this,

15    10    5    2    3     4

i.e., with v2FfaPre and  v2Ffa stripped off

Any suggestions,

Thanks in advance!

-- 
Oscar
Oscar A. Linares, MD
Translational Medicine Unit
LaPlaisance Bay, Bolles Harbor
Monroe, Michigan 48161

Department of Medicine,
University of Toledo College of Medicine
Toledo, OH 43606-3390

Department of Internal Medicine,
The Detroit Medical Center (DMC)
Harper University Hospital
Wayne State University School of Medicine
Detroit, Michigan 48201


Re: [R] Data frame/read.ftable

2009-12-02 Thread David Winsemius

On Dec 2, 2009, at 7:02 PM, Robinson, David G wrote:

> My apologies for this question but I'm stuck and I'm sure that there must be
> an easy answer out there (and hope that someone will have mercy and point me
> in the right direction).
>
> I have a data file that looks like:
> 1 77 3
> 1 8 1
> 1 7 2
> 1 1 5
> 1 42 7
> 1 0 2
> 1 23 1
> 2 83 9
> 2 8 2
> 2 6 5
> 2 23 3
> 3 11 3
> 3 8 1
> .
>  etc.
> .
> N   3   2
>
> (FWIW, these are document, word reference, and word frequency counts.) I
> want to read the data into data frame, Doc, such that
> Doc[[1]]=
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
> [1,]   77    8    7    1   42    0   23
> [2,]    3    1    2    5    7    2    1
>
> Doc[[2]]=
>      [,1] [,2] [,3] [,4]
> [1,]   83    8    6   23
> [2,]    9    2    5    3
>
> Etc.

rd.txt <- function(txt, header=TRUE) {read.table(textConnection(txt),  
header=header)}

> dta <- rd.txt("1 77 3
+ 1 8 1
+ 1 7 2
+ 1 1 5
+ 1 42 7
+ 1 0 2
+ 1 23 1
+ 2 83 9
+ 2 8 2
+ 2 6 5
+ 2 23 3
+ 3 11 3
+ 3 8 1", header=F)
> dta
   V1 V2 V3
1   1 77  3
2   1  8  1
3   1  7  2
4   1  1  5
5   1 42  7
6   1  0  2
7   1 23  1
8   2 83  9
9   2  8  2
10  2  6  5
11  2 23  3
12  3 11  3
13  3  8  1

> split(dta[ ,-1], list(dta[,1]))
$`1`
  V2 V3
1 77  3
2  8  1
3  7  2
4  1  5
5 42  7
6  0  2
7 23  1

$`2`
   V2 V3
8  83  9
9   8  2
10  6  5
11 23  3

$`3`
   V2 V3
12 11  3
13  8  1

> ?split
> lapply(split(dta[ ,-1], list(dta[,1])), t)
$`1`
    1 2 3 4  5 6  7
V2 77 8 7 1 42 0 23
V3  3 1 2 5  7 2  1

$`2`
    8 9 10 11
V2 83 8  6 23
V3  9 2  5  3

$`3`
   12 13
V2 11  8
V3  3  1

> It seems like I should be able to do this using a flat contingency table
> method such as 'read.ftable' or possibly using 'stack'. However, something
> is not clicking and hence my plea for assistance.
>
> Thanks in advance,
> Dave Robinson
> dro...@sandia.gov


David Winsemius, MD
Heritage Laboratories
West Hartford, CT
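
If the goal is the exact layout shown in the question (plain 2-row matrices without row or column labels), the split/lapply idea above can be extended slightly. This is a sketch under that assumption; the list name `Doc` follows the question, and only the 13 rows shown in the post are used.

```r
# The sample rows from the post: document, word reference, word count
dta <- read.table(text = "1 77 3
1 8 1
1 7 2
1 1 5
1 42 7
1 0 2
1 23 1
2 83 9
2 8 2
2 6 5
2 23 3
3 11 3
3 8 1", header = FALSE)

# Split columns 2 and 3 by document, transpose each piece,
# and strip the dimnames so the result prints with [,1] [,2] ... headers
Doc <- lapply(split(dta[, -1], dta[, 1]),
              function(g) unname(t(as.matrix(g))))
Doc[[1]]
Doc[[2]]
```

Each element of `Doc` is a matrix rather than a data frame, which matches the printed form in the question; keep the data frames from split() instead if the column names matter later.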


[R] Data frame/read.ftable

2009-12-02 Thread Robinson, David G

My apologies for this question but I'm stuck and I'm sure that there must be
an easy answer out there (and hope that someone will have mercy and point me
in the right direction).

I have a data file that looks like:
1 77 3
1 8 1
1 7 2
1 1 5
1 42 7
1 0 2
1 23 1
2 83 9
2 8 2
2 6 5
2 23 3
3 11 3
3 8 1
.
 etc.
.
N   3   2


(FWIW, these are document, word reference, and word frequency counts.) I
want to read the data into data frame, Doc, such that
Doc[[1]]=
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]   77    8    7    1   42    0   23
[2,]    3    1    2    5    7    2    1

Doc[[2]]=
     [,1] [,2] [,3] [,4]
[1,]   83    8    6   23
[2,]    9    2    5    3

Etc. 


It seems like I should be able to do this using a flat contingency table
method such as 'read.ftable' or possibly using 'stack' . However, something
is not clicking and hence my plea for assistance.

Thanks in advance, 
Dave Robinson
dro...@sandia.gov


Re: [R] Data linkage functions for probabilistic linkage using person identifiers

2009-11-18 Thread Doran, Harold

Interestingly enough, I just posted a package to CRAN with a function that might
be useful. It is called MiscPsycho and is for psychometric work. The updated
version of the package should be available in a day or so. It has a function
called stringMatch, which implements the Levenshtein distance or a
normalized version of the distance (what I call the LND). There is also a
function called stringProbs, which gives the probability of observing a given
LND.

In education, we merge data sets all the time using a unique ID. It turns out, 
however, that the unique ID is not so unique. It is often shared by many kids 
over time, duplicated within a year, etc. So, we need to first merge using the 
ID and then validate that we have merged properly using some other mechanism. I 
think the LND is very useful for this purpose.

So, here is an example of the function in this package:

### A perfect match gives an LND of 1
> stringMatch('William Clinton', 'William Clinton', normalize='YES')
[1] 1

### A close match gives an LND less than 1
> stringMatch('William Clinton', 'Bill Clinton', normalize='YES')
[1] 0.733
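
For readers without the package, the two results above can be reproduced in base R. This is a sketch that assumes the LND is 1 minus the Levenshtein distance divided by the length of the longer string; that definition matches the 0.733 shown, but it is an inference from the output, not taken from the MiscPsycho documentation. The function name `lnd` is made up here.

```r
# Assumed definition: 1 - (edit distance / length of the longer string)
lnd <- function(a, b) {
  d <- as.numeric(adist(a, b))   # utils::adist gives the Levenshtein distance
  1 - d / max(nchar(a), nchar(b))
}

lnd("William Clinton", "William Clinton")
round(lnd("William Clinton", "Bill Clinton"), 3)
```

Here "William" becomes "Bill" with one substitution and three deletions, so the distance is 4 over a longer length of 15.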

If your database is small, you can actually look at the records and see whether
values less than 1 are really the same name spelled differently, misspelled,
etc.

But if your data set has hundreds of thousands of records, that becomes
impossible. So what I do is compute the probability that you would observe an
LND of .7 or higher. This is implemented in the stringProbs function. Let's say
the probability of observing an LND of .7 or higher is .05, and that the
probability is larger for lower LND values. Assuming you are willing to live
with this much risk, you might then subset your data and retain records as
"valid merges" only if the LND value is greater than .7.
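
The subsetting step described here might look like the following. It is only a sketch: the data frame `merged` and its columns are invented, the LND definition (1 minus edit distance over the longer length) is an assumption, and the .7 cutoff is simply the value used in the text. stringProbs is not called because its arguments are not shown in this thread.

```r
# Assumed pairwise LND for two character vectors of equal length
lnd <- function(a, b) {
  d <- mapply(function(x, y) as.numeric(adist(x, y)), a, b,
              USE.NAMES = FALSE)
  1 - d / pmax(nchar(a), nchar(b))
}

# Hypothetical result of merging two files on a shared ID
merged <- data.frame(
  id     = c(101, 102, 103),
  name_a = c("William Clinton", "William Clinton", "Jane Smith"),
  name_b = c("William Clinton", "Bill Clinton", "Robert Jones"),
  stringsAsFactors = FALSE
)

merged$lnd <- lnd(merged$name_a, merged$name_b)

# Keep only the rows treated as valid merges
valid <- merged[merged$lnd > 0.7, ]
valid
```

The third row shares an ID but the names have almost nothing in common, so it falls well below the cutoff and is dropped.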

The record linking literature is very big, but it is extremely small in 
education. So, I have a paper in press demonstrating this application and 
comparing it to other linking methods, like use of Soundex codes. In the paper, 
I also discuss how you would combine other demographic information, such as 
birthdates, etc to further explore probabilities of a correct match.

Harold

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of David Winsemius
Sent: Wednesday, November 18, 2009 4:32 PM
To: Dagan A WRIGHT
Cc: r-help@r-project.org
Subject: Re: [R] Data linkage functions for probabilistic linkage using person 
identifiers

On Nov 18, 2009, at 1:21 PM, Dagan A WRIGHT wrote:

> I am somewhat new to R although using and liking already.  I am
> curious if there are any probabilistic packages similar in function
> to others such as Link King (http://www.the-link-king.com/).  I am
> looking for functions in SSN, First/Last name, date of birth, and a
> couple other indicators for matching.
>

Cannot comment on similarities to Link King but have used the  
functions found with this search in similar applications:

RSiteSearch("Levenshtein")  #yes, that is spelled correctly

> Thanks
>
> Dagan Wright, Ph.D., M.S.P.H.
> Lead Addictions Research Analyst, Analysis & Evaluation Unit
> Addictions & Mental Health Division (AMH)
> 500 Summer St. NE E86
> Salem, Oregon 97301-1118
>
> Office number: 503-945-5726
> Fax number: 503-378-8467
> dagan.a.wri...@state.or.us
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] Data linkage functions for probabilistic linkage using person identifiers

2009-11-18 Thread David Winsemius



On Nov 18, 2009, at 1:21 PM, Dagan A WRIGHT wrote:

> I am somewhat new to R although using and liking already.  I am
> curious if there are any probabilistic packages similar in function
> to others such as Link King (http://www.the-link-king.com/).  I am
> looking for functions in SSN, First/Last name, date of birth, and a
> couple other indicators for matching.




Cannot comment on similarities to Link King but have used the  
functions found with this search in similar applications:


RSiteSearch("Levenshtein")  #yes, that is spelled correctly



> Thanks
>
> Dagan Wright, Ph.D., M.S.P.H.
> Lead Addictions Research Analyst, Analysis & Evaluation Unit
> Addictions & Mental Health Division (AMH)
> 500 Summer St. NE E86
> Salem, Oregon 97301-1118
>
> Office number: 503-945-5726
> Fax number: 503-378-8467
> dagan.a.wri...@state.or.us


[R] Data linkage functions for probabilistic linkage using person identifiers

2009-11-18 Thread Dagan A WRIGHT

I am somewhat new to R, although I am using and liking it already.  I am curious
whether there are any probabilistic linkage packages similar in function to
others such as Link King (http://www.the-link-king.com/).  I am looking for
functions that match on SSN, first/last name, date of birth, and a couple of
other indicators.

Thanks

Dagan Wright, Ph.D., M.S.P.H.
Lead Addictions Research Analyst, Analysis & Evaluation Unit
Addictions & Mental Health Division (AMH)
500 Summer St. NE E86
Salem, Oregon 97301-1118

Office number: 503-945-5726
Fax number: 503-378-8467
dagan.a.wri...@state.or.us


Re: [R] Data source name not found and no default driver specified

2009-11-16 Thread helpme

I forgot to mention that it's running Windows Server 2003 x64 OS version
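
One thing worth checking, offered as a guess rather than a diagnosis: on 64-bit Windows a 32-bit program sees only the DSNs registered with the 32-bit ODBC administrator (usually C:\Windows\SysWOW64\odbcad32.exe), not the system DSNs created in the 64-bit one. If R is a 32-bit build, a DSN added through the default Control Panel applet may be invisible to it, which would give this "Data source name not found" message. A small sketch to see what R itself can find:

```r
# Is this R process 32-bit or 64-bit? Pointers are 8 bytes on 64-bit builds.
r_is_64bit <- .Machine$sizeof.pointer == 8
cat("64-bit R:", r_is_64bit, "\n")

# List the DSNs visible to this R process (only if RODBC is installed)
if (requireNamespace("RODBC", quietly = TRUE)) {
  print(RODBC::odbcDataSources())
}
```

If mfopdw is missing from that list, the DSN would need to be created again in the ODBC administrator that matches the R build.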

On Mon, Nov 16, 2009 at 11:22 AM, helpme  wrote:

> I'm stumped. When trying to connect to Oracle using the RODBC package I get
> an error:
> *[RODBC] Data source name not found and no default driver specified.
> ODBC connect failed.*
>
> I've read over all the posts and documentation manuals.
> The system is Windows Server 2003 with R 2.8.1 and the latest downloadable
> RODBC package. The Oracle SID/DSN is mfopdw. I made sure to add it to
> Control Panel->Administrative Privileges->Microsoft ODBC system/user DSN.
>
> I've also tried the following in no particular order:
>
> 1.) Turn on all oracle services in control panel->administrative
> privileges.
> 2.) Checked tnsnames.ora for SID.
> 3.) Add microsoft ODBC service to Control Panel services for SID
> 4.) Use SQL Developer to test connection another way besides R (It was
> successful)
> 5.) channel<-odbcDriverConnect(connection="Driver={Microsoft ODBC for
> Oracle}; DSN=abc,UID=abc;PWD=abc;"case="oracle")
>
> received error drivers SQLAllocHandle on SQL_HANDLE_ENV failed one time;
> another time I got the error that Oracle client and networking components
> 7.3 or greater is not found.
>
> 6.) tnsping mfopdw
>
> lsnrctl start mfopdw
>
> tried to add oracle/bin to path
>
> Nothing is working.
>
>
> Please advise.
>
> Thank you,
>


[R] Data source name not found and no default driver specified

2009-11-16 Thread helpme

I'm stumped. When trying to connect to Oracle using the RODBC package I get
an error:
*[RODBC] Data source name not found and no default driver specified.
ODBC connect failed.*

I've read over all the posts and documentation manuals.
The system is Windows Server 2003 with R 2.8.1 and the latest downloadable
RODBC package. The Oracle SID/DSN is mfopdw. I made sure to add it to
Control Panel->Administrative Privileges->Microsoft ODBC system/user DSN.

I've also tried the following in no particular order:

1.) Turn on all oracle services in control panel->administrative
privileges.
2.) Checked tnsnames.ora for SID.
3.) Add microsoft ODBC service to Control Panel services for SID
4.) Use SQL Developer to test connection another way besides R (It was
successful)
5.) channel<-odbcDriverConnect(connection="Driver={Microsoft ODBC for
Oracle}; DSN=abc,UID=abc;PWD=abc;"case="oracle")

received error drivers SQLAllocHandle on SQL_HANDLE_ENV failed one time;
another time I got the error that Oracle client and networking components
7.3 or greater is not found.

6.) tnsping mfopdw

lsnrctl start mfopdw

tried to add oracle/bin to path

Nothing is working.


Please advise.

Thank you,
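
A note on the connection string in step 5, as a sketch only. As posted it has a comma after DSN=abc where a semicolon is expected, and no comma before the `case` argument, so R would not parse it. The helper `odbc_conn_string` below is hypothetical, and the `Server`, `Uid` and `Pwd` keywords are my assumption about what the Microsoft ODBC for Oracle driver expects; the driver documentation should be checked. The `case` argument is left out because "oracle" does not appear to be one of the values RODBC documents.

```r
# Hypothetical helper: join key=value pairs into "k1=v1;k2=v2;...;"
odbc_conn_string <- function(...) {
  parts <- list(...)
  paste0(paste(names(parts), unlist(parts), sep = "=", collapse = ";"), ";")
}

conn <- odbc_conn_string(Driver = "{Microsoft ODBC for Oracle}",
                         Server = "mfopdw",   # TNS alias from the post
                         Uid    = "abc",      # placeholder credentials
                         Pwd    = "abc")
conn

# Only attempt the connection in an interactive session with RODBC installed
if (interactive() && requireNamespace("RODBC", quietly = TRUE)) {
  channel <- RODBC::odbcDriverConnect(connection = conn)
  # Alternative, if the DSN is visible to R:
  # channel <- RODBC::odbcConnect("mfopdw", uid = "abc", pwd = "abc")
}
```

Building the string from named parts makes separator mistakes harder to introduce, and printing `conn` before connecting shows exactly what the driver manager will receive.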


Re: [R] data frame subsets?

2009-11-13 Thread David Winsemius



On Nov 12, 2009, at 10:57 PM, Douglas M. Hultstrand wrote:


> Hello,
>
> I am trying to create data frame subsets based on binned temperature
> data.  I have code working to create the bins (d.1 and d.2), but it
> takes two steps, I was wondering if I could merge into one step.
> See Below
>
> d
>    n year mo da hr   t   td  tw rh   kPa
> 1   1 1945  3  1  0 1.1  0.0 0.6 92 101.7
> 2   2 1945  3  1  1 2.8 -1.1 1.1 76 101.8
> 3   3 1945  3  1  2 2.2 -1.7 0.6 75 101.9
> 4   4 1945  3  1  3 1.7 -1.1 0.6 82 102.0
> 5   5 1945  3  1  4 1.7 -2.8 0.0 72 102.1
> 6   6 1945  3  1  5 1.7 -1.7 0.0 78 102.2
> 7   7 1945  3  1  6 1.1 -2.8 0.0 75 102.2
> 8   8 1945  3  1  7 1.1 -1.7 0.0 82 102.4
> 9   9 1945  3  1  8 1.7 -1.1 0.6 82 102.5
> 10 10 1945  3  1  9 2.8 -3.3 0.6 64 102.6
>
> d.1 <- d[d$t >= 1.0,]
> d.2 <- d.1[d.1$t < 2.0,]
>
> How can I make  d.1 and d.2 into one step? I have tried several
> different methods such as example below.
>
> d.1 <- d[d$t >= 1.0, && d$t < 2.0,]


Instead of using "&&", use "&", the vectorized version of the AND
operation. "&&" looks only at the first element of each operand and
returns a single value rather than the logical vector you need.
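
A minimal sketch of the one-step version, using only the `n` and `t` columns from the posted data. Note that the comma in front of the operator in the attempted line also has to go, since the row condition must be a single expression. In recent versions of R (4.3.0 and later) "&&" with operands longer than one is an error rather than a silent truncation.

```r
# The n and t columns from the post
d <- data.frame(n = 1:10,
                t = c(1.1, 2.8, 2.2, 1.7, 1.7, 1.7, 1.1, 1.1, 1.7, 2.8))

# Elementwise AND gives one TRUE/FALSE per row
keep <- d$t >= 1.0 & d$t < 2.0

# Rows where the temperature falls in [1, 2)
d.bin <- d[keep, ]
nrow(d.bin)
```

Storing the condition in `keep` first is optional; d[d$t >= 1.0 & d$t < 2.0, ] does the same thing in one line.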


--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


Re: [R] data frame subsets?

2009-11-12 Thread Jorge Ivan Velez

Hi Douglas,

Here is a suggestion:

subset(d, t >= 1 & t < 2)

See ?subset for more information.
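
Since the stated goal is binned temperature data, here is a sketch that goes one step further and builds every bin at once with cut() and split(). The break points 1, 2 and 3 are my assumption based on the range of `t` in the posted rows; only the `n` and `t` columns are reproduced.

```r
d <- data.frame(n = 1:10,
                t = c(1.1, 2.8, 2.2, 1.7, 1.7, 1.7, 1.1, 1.1, 1.7, 2.8))

# Single bin, as in the suggestion above
one_bin <- subset(d, t >= 1 & t < 2)

# All bins in one go: right = FALSE makes the intervals [1,2) and [2,3)
bins <- split(d, cut(d$t, breaks = c(1, 2, 3), right = FALSE))
names(bins)
sapply(bins, nrow)
```

`bins` is a list of data frames named after the intervals, so bins[["[1,2)"]] holds the same rows as `one_bin`.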

HTH,
Jorge


On Thu, Nov 12, 2009 at 10:57 PM, Douglas M. Hultstrand wrote:

> Hello,
>
> I am trying to create data frame subsets based on binned temperature data.
>  I have code working to create the bins (d.1 and d.2), but it takes two
> steps, I was wondering if I could merge into one step.  See Below
>
> d
>   n year mo da hr   t   td  tw rh   kPa
> 1   1 1945  3  1  0 1.1  0.0 0.6 92 101.7
> 2   2 1945  3  1  1 2.8 -1.1 1.1 76 101.8
> 3   3 1945  3  1  2 2.2 -1.7 0.6 75 101.9
> 4   4 1945  3  1  3 1.7 -1.1 0.6 82 102.0
> 5   5 1945  3  1  4 1.7 -2.8 0.0 72 102.1
> 6   6 1945  3  1  5 1.7 -1.7 0.0 78 102.2
> 7   7 1945  3  1  6 1.1 -2.8 0.0 75 102.2
> 8   8 1945  3  1  7 1.1 -1.7 0.0 82 102.4
> 9   9 1945  3  1  8 1.7 -1.1 0.6 82 102.5
> 10 10 1945  3  1  9 2.8 -3.3 0.6 64 102.6
>
> d.1 <- d[d$t >= 1.0,]
> d.2 <- d.1[d.1$t < 2.0,]
>
> How can I make  d.1 and d.2 into one step? I have tried several different
> methods such as example below.
> d.1 <- d[d$t >= 1.0, && d$t < 2.0,]
>
> Thanks,
> Doug
>
> --
> -
> Douglas M. Hultstrand, MS
> Senior Hydrometeorologist
> Metstat, Inc. Windsor, Colorado
> voice: 970.686.1253
> email: dmhul...@metstat.com
> web: http://www.metstat.com
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


[R] data frame subsets?

2009-11-12 Thread Douglas M. Hultstrand


Hello,

I am trying to create data frame subsets based on binned temperature
data.  I have code working to create the bins (d.1 and d.2), but it
takes two steps; I was wondering if I could merge them into one step.  See below.


d
   n year mo da hr   t   td  tw rh   kPa
1   1 1945  3  1  0 1.1  0.0 0.6 92 101.7
2   2 1945  3  1  1 2.8 -1.1 1.1 76 101.8
3   3 1945  3  1  2 2.2 -1.7 0.6 75 101.9
4   4 1945  3  1  3 1.7 -1.1 0.6 82 102.0
5   5 1945  3  1  4 1.7 -2.8 0.0 72 102.1
6   6 1945  3  1  5 1.7 -1.7 0.0 78 102.2
7   7 1945  3  1  6 1.1 -2.8 0.0 75 102.2
8   8 1945  3  1  7 1.1 -1.7 0.0 82 102.4
9   9 1945  3  1  8 1.7 -1.1 0.6 82 102.5
10 10 1945  3  1  9 2.8 -3.3 0.6 64 102.6

d.1 <- d[d$t >= 1.0,]
d.2 <- d.1[d.1$t < 2.0,]

How can I make d.1 and d.2 in one step? I have tried several
different methods, such as the example below.

d.1 <- d[d$t >= 1.0, && d$t < 2.0,]

Thanks,
Doug

--
-
Douglas M. Hultstrand, MS
Senior Hydrometeorologist
Metstat, Inc. Windsor, Colorado
voice: 970.686.1253
email: dmhul...@metstat.com
web: http://www.metstat.com


Re: [R] Data transformation

2009-11-11 Thread hadley wickham

>> (x.n <- cast(x.m, id ~ var, function(.dat){
> +     if (length(.dat) == 0) return(0)  # test for no data; return zero if that is the case
> +     mean(.dat)
> + }))

Or fill = 0.
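
Spelled out, the `fill` suggestion might look like this. It is a sketch that rebuilds the example data by hand and only runs the cast() call if the reshape package is installed; `fill` is the cast() argument that supplies the value for empty cells.

```r
# Example data from the thread
x <- data.frame(id    = c(1, 1, 2, 2, 2, 3, 3),
                code1 = c(4, 5, 1, 6, 4, 5, 7),
                code2 = c(8, 7, 8, 2, 3, 6, 5),
                p     = c(0.1, 0.9, 0.4, 0.2, 0.6, 0.7, 0.9))

# Long format, shaped like the output of melt()
x.m <- data.frame(id       = c(x$id, x$id),
                  var      = paste("var", c(x$code1, x$code2), sep = ""),
                  variable = rep("p", 2 * nrow(x)),
                  value    = c(x$p, x$p))

if (requireNamespace("reshape", quietly = TRUE)) {
  # mean() aggregates duplicates; fill = 0 replaces the NaN for empty cells
  x.n <- reshape::cast(x.m, id ~ var, mean, fill = 0)
  print(x.n)
}
```

This keeps the aggregation function as plain mean, so the zero-length test inside the custom function is no longer needed.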

Hadley


-- 
http://had.co.nz/


Re: [R] Data transformation

2009-11-11 Thread legen


That's what I want. Many thanks for your help.
Legen



jholtman wrote:
> 
> Try this:
> 
>> x <- read.table(textConnection("id    code1    code2         p
> +  1        4        8           0.1
> +  1        5        7           0.9
> +  2        1        8           0.4
> +  2        6        2           0.2
> +  2        4        3           0.6
> +  3        5        6           0.7
> +  3        7        5           0.9"), header=TRUE)
>>  closeAllConnections()
>>  # create object like output from 'melt'
>>  x.m <- data.frame(id=c(x$id, x$id),
> +     var=paste('var', c(x$code1, x$code2), sep=''),
> +     variable=rep('p', 2*nrow(x)),
> +     value=c(x$p, x$p))
>> require(reshape)  # use the reshape package
>> (x.n <- cast(x.m, id ~ var, function(.dat){
> +     if (length(.dat) == 0) return(0)  # test for no data; return zero if that is the case
> +     mean(.dat)
> + }))
>   id var1 var2 var3 var4 var5 var6 var7 var8
> 1  1  0.0  0.0  0.0  0.1  0.9  0.0  0.9  0.1
> 2  2  0.4  0.2  0.6  0.6  0.0  0.2  0.0  0.4
> 3  3  0.0  0.0  0.0  0.0  0.8  0.7  0.9  0.0
>>
> 
> 
> On Tue, Nov 10, 2009 at 11:10 PM, legen  wrote:
>>
>> Thank you for your kind help. Your script works very well. Would you
>> please
>> show me how to change NaN to zero and column variables 1, 2, ..., 8 to
>> var1,
>> var2, ..., var8? Thanks again.
>>
>> Legen
>>
>>
>>
>> jholtman wrote:
>>>
>>> Is this what you want:
>>>
>>>> x <- read.table(textConnection("id    code1    code2         p
>>> +  1        4        8           0.1
>>> +  1        5        7           0.9
>>> +  2        1        8           0.4
>>> +  2        6        2           0.2
>>> +  2        4        3           0.6
>>> +  3        5        6           0.7
>>> +  3        7        5           0.9"), header=TRUE)
>>>>  closeAllConnections()
>>>>  # create object like output from 'melt'
>>>>  x.m <- data.frame(id=c(x$id, x$id), var=c(x$code1, x$code2),
>>> +     variable=rep('p', 2*nrow(x)), value=c(x$p, x$p))
>>>> require(reshape)  # use the reshape package
>>>> cast(x.m, id ~ var, mean)
>>>   id   1   2   3   4   5   6   7   8
>>> 1  1 NaN NaN NaN 0.1 0.9 NaN 0.9 0.1
>>> 2  2 0.4 0.2 0.6 0.6 NaN 0.2 NaN 0.4
>>> 3  3 NaN NaN NaN NaN 0.8 0.7 0.9 NaN

>>>
>>>
>>>
>>> On Tue, Nov 10, 2009 at 4:30 PM, legen  wrote:

>>>> Dear all,
>>>>
>>>> I have a dataset as below:
>>>>
>>>> id    code1    code2         p
>>>>  1        4        8           0.1
>>>>  1        5        7           0.9
>>>>  2        1        8           0.4
>>>>  2        6        2           0.2
>>>>  2        4        3           0.6
>>>>  3        5        6           0.7
>>>>  3        7        5           0.9
>>>>
>>>> I just want to rewrite it as this (vertical to horizontal):
>>>>
>>>> id   var1  var2  var3  var4  var5  var6  var7  var8
>>>> 1        0      0      0    0.1   0.9       0   0.9    0.1
>>>> 2     0.4    0.2   0.6    0.6      0    0.2      0    0.4
>>>> 3        0      0      0      0    0.8    0.7    0.9      0
>>>>
>>>> For the third subject, there are two values being equal to 5 in code1
>>>> and
>>>> code2, but different values in p:  0.7 and 0.9, so I assigned their
>>>> average
>>>> 0.8 in var5.
>>>>
>>>> Does anybody can help me to handle this? Many thanks for your
>>>> consideration
>>>> and time.
>>>>
>>>> Legen
>>>>
>>>> --
>>>> View this message in context:
>>>> http://old.nabble.com/Data-transformation-tp26291568p26291568.html
>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>
>>>> __
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.

>>>
>>>
>>>
>>> --
>>> Jim Holtman
>>> Cincinnati, OH
>>> +1 513 646 9390
>>>
>>> What is the problem that you are trying to solve?
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Data-transformation-tp26291568p26295766.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
> 
> 
> 
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
> 
> What is the problem that you are trying to solve?
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/lis

Re: [R] Data transformation

2009-11-11 Thread legen


Your script works very well. Thank you very much.

Legen



Henrique Dallazuanna wrote:
> 
> Try this also:
> 
> xtabs(rep(p, 2) ~ rep(id, 2) + sprintf("var%d", c(code1, code2)), data =
> x)
> 
> On Wed, Nov 11, 2009 at 2:10 AM, legen  wrote:
>>
>> Thank you for your kind help. Your script works very well. Would you
>> please
>> show me how to change NaN to zero and column variables 1, 2, ..., 8 to
>> var1,
>> var2, ..., var8? Thanks again.
>>
>> Legen
>>
>>
>>
>> jholtman wrote:
>>>
>>> Is this what you want:
>>>
>>>> x <- read.table(textConnection("id    code1    code2         p
>>> +  1        4        8           0.1
>>> +  1        5        7           0.9
>>> +  2        1        8           0.4
>>> +  2        6        2           0.2
>>> +  2        4        3           0.6
>>> +  3        5        6           0.7
>>> +  3        7        5           0.9"), header=TRUE)
>>>>  closeAllConnections()
>>>>  # create object like output from 'melt'
>>>>  x.m <- data.frame(id=c(x$id, x$id), var=c(x$code1, x$code2),
>>> +     variable=rep('p', 2*nrow(x)), value=c(x$p, x$p))
>>>> require(reshape)  # use the reshape package
>>>> cast(x.m, id ~ var, mean)
>>>   id   1   2   3   4   5   6   7   8
>>> 1  1 NaN NaN NaN 0.1 0.9 NaN 0.9 0.1
>>> 2  2 0.4 0.2 0.6 0.6 NaN 0.2 NaN 0.4
>>> 3  3 NaN NaN NaN NaN 0.8 0.7 0.9 NaN

>>>
>>>
>>>
>>> On Tue, Nov 10, 2009 at 4:30 PM, legen  wrote:

>>>> Dear all,
>>>>
>>>> I have a dataset as below:
>>>>
>>>> id    code1    code2         p
>>>>  1        4        8           0.1
>>>>  1        5        7           0.9
>>>>  2        1        8           0.4
>>>>  2        6        2           0.2
>>>>  2        4        3           0.6
>>>>  3        5        6           0.7
>>>>  3        7        5           0.9
>>>>
>>>> I just want to rewrite it as this (vertical to horizontal):
>>>>
>>>> id   var1  var2  var3  var4  var5  var6  var7  var8
>>>> 1        0      0      0    0.1   0.9       0   0.9    0.1
>>>> 2     0.4    0.2   0.6    0.6      0    0.2      0    0.4
>>>> 3        0      0      0      0    0.8    0.7    0.9      0
>>>>
>>>> For the third subject, there are two values being equal to 5 in code1
>>>> and
>>>> code2, but different values in p:  0.7 and 0.9, so I assigned their
>>>> average
>>>> 0.8 in var5.
>>>>
>>>> Does anybody can help me to handle this? Many thanks for your
>>>> consideration
>>>> and time.
>>>>
>>>> Legen
>>>>
>>>> --
>>>> View this message in context:
>>>> http://old.nabble.com/Data-transformation-tp26291568p26291568.html
>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>
>>>> __
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.

>>>
>>>
>>>
>>> --
>>> Jim Holtman
>>> Cincinnati, OH
>>> +1 513 646 9390
>>>
>>> What is the problem that you are trying to solve?
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Data-transformation-tp26291568p26295766.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
> 
> 
> 
> -- 
> Henrique Dallazuanna
> Curitiba-Paraná-Brasil
> 25° 25' 40" S 49° 16' 22" O
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 


Re: [R] Data transformation

2009-11-11 Thread jim holtman

Try this:

> x <- read.table(textConnection("id    code1    code2         p
+  1        4        8           0.1
+  1        5        7           0.9
+  2        1        8           0.4
+  2        6        2           0.2
+  2        4        3           0.6
+  3        5        6           0.7
+  3        7        5           0.9"), header=TRUE)
>  closeAllConnections()
>  # create object like output from 'melt'
>  x.m <- data.frame(id=c(x$id, x$id),
+     var=paste('var', c(x$code1, x$code2), sep=''),
+     variable=rep('p', 2*nrow(x)),
+     value=c(x$p, x$p))
> require(reshape)  # use the reshape package
> (x.n <- cast(x.m, id ~ var, function(.dat){
+     if (length(.dat) == 0) return(0)  # test for no data; return zero if that is the case
+     mean(.dat)
+ }))
  id var1 var2 var3 var4 var5 var6 var7 var8
1  1  0.0  0.0  0.0  0.1  0.9  0.0  0.9  0.1
2  2  0.4  0.2  0.6  0.6  0.0  0.2  0.0  0.4
3  3  0.0  0.0  0.0  0.0  0.8  0.7  0.9  0.0
>
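
For the NaN-to-zero and renaming question, a base R sketch that post-processes the earlier result instead of changing the aggregation function. The small data frame here is a hand-typed stand-in for two columns of the earlier cast() output, not the real object.

```r
# Stand-in for part of the earlier result (columns "4" and "5" only)
x.n <- data.frame(id  = 1:3,
                  "4" = c(0.1, 0.6, NaN),
                  "5" = c(0.9, NaN, 0.8),
                  check.names = FALSE)

# is.na() is TRUE for NaN as well as NA, so this replaces both with 0
x.n[is.na(x.n)] <- 0

# Prefix every column except id with "var"
names(x.n)[-1] <- paste("var", names(x.n)[-1], sep = "")
x.n
```

This is handy when the wide table already exists and only the labels and empty cells need tidying.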


On Tue, Nov 10, 2009 at 11:10 PM, legen  wrote:
>
> Thank you for your kind help. Your script works very well. Would you please
> show me how to change NaN to zero and column variables 1, 2, ..., 8 to var1,
> var2, ..., var8? Thanks again.
>
> Legen
>
>
>
> jholtman wrote:
>>
>> Is this what you want:
>>
>>> x <- read.table(textConnection("id    code1    code2         p
>> +  1        4        8           0.1
>> +  1        5        7           0.9
>> +  2        1        8           0.4
>> +  2        6        2           0.2
>> +  2        4        3           0.6
>> +  3        5        6           0.7
>> +  3        7        5           0.9"), header=TRUE)
>>>  closeAllConnections()
>>>  # create object like output from 'melt'
>>>  x.m <- data.frame(id=c(x$id, x$id), var=c(x$code1, x$code2),
>> +     variable=rep('p', 2*nrow(x)), value=c(x$p, x$p))
>>> require(reshape)  # use the reshape package
>>> cast(x.m, id ~ var, mean)
>>   id   1   2   3   4   5   6   7   8
>> 1  1 NaN NaN NaN 0.1 0.9 NaN 0.9 0.1
>> 2  2 0.4 0.2 0.6 0.6 NaN 0.2 NaN 0.4
>> 3  3 NaN NaN NaN NaN 0.8 0.7 0.9 NaN
>>>
>>
>>
>>
>> On Tue, Nov 10, 2009 at 4:30 PM, legen  wrote:
>>>
>>> Dear all,
>>>
>>> I have a dataset as below:
>>>
>>> id    code1    code2         p
>>>  1        4        8           0.1
>>>  1        5        7           0.9
>>>  2        1        8           0.4
>>>  2        6        2           0.2
>>>  2        4        3           0.6
>>>  3        5        6           0.7
>>>  3        7        5           0.9
>>>
>>> I just want to rewrite it as this (vertical to horizontal):
>>>
>>> id   var1  var2  var3  var4  var5  var6  var7  var8
>>> 1        0      0      0    0.1   0.9       0   0.9    0.1
>>> 2     0.4    0.2   0.6    0.6      0    0.2      0    0.4
>>> 3        0      0      0      0    0.8    0.7    0.9      0
>>>
>>> For the third subject, there are two values being equal to 5 in code1 and
>>> code2, but different values in p:  0.7 and 0.9, so I assigned their
>>> average
>>> 0.8 in var5.
>>>
>>> Does anybody can help me to handle this? Many thanks for your
>>> consideration
>>> and time.
>>>
>>> Legen
>>>
>>> --
>>> View this message in context:
>>> http://old.nabble.com/Data-transformation-tp26291568p26291568.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> Jim Holtman
>> Cincinnati, OH
>> +1 513 646 9390
>>
>> What is the problem that you are trying to solve?
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
> --
> View this message in context: 
> http://old.nabble.com/Data-transformation-tp26291568p26295766.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: [R] Data transformation

2009-11-11 Thread Henrique Dallazuanna

Try this also:

xtabs(rep(p, 2) ~ rep(id, 2) + sprintf("var%d", c(code1, code2)), data = x)
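
One caveat, as far as I can tell: xtabs() adds up the values that fall in the same cell, so for subject 3 the var5 entry comes out as 0.7 + 0.9 = 1.6 rather than the average of 0.8 asked for in the original post. A base R sketch that averages instead, using tapply(); the data frame is retyped from the thread and the object names `sums` and `means` are mine.

```r
x <- data.frame(id    = c(1L, 1L, 2L, 2L, 2L, 3L, 3L),
                code1 = c(4L, 5L, 1L, 6L, 4L, 5L, 7L),
                code2 = c(8L, 7L, 8L, 2L, 3L, 6L, 5L),
                p     = c(0.1, 0.9, 0.4, 0.2, 0.6, 0.7, 0.9))

# The xtabs() call from above: duplicates within a cell are summed
sums <- xtabs(rep(p, 2) ~ rep(id, 2) + sprintf("var%d", c(code1, code2)),
              data = x)

# Averaging version: tapply() with mean, then zero for the empty cells
id    <- rep(x$id, 2)
var   <- sprintf("var%d", c(x$code1, x$code2))
value <- rep(x$p, 2)
means <- tapply(value, list(id = id, var = var), mean)
means[is.na(means)] <- 0

sums["3", "var5"]
means["3", "var5"]
```

For data with at most one value per id and code the two approaches agree, so the difference only shows up in cells like this one.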

On Wed, Nov 11, 2009 at 2:10 AM, legen  wrote:
>
> Thank you for your kind help. Your script works very well. Would you please
> show me how to change NaN to zero and column variables 1, 2, ..., 8 to var1,
> var2, ..., var8? Thanks again.
>
> Legen
>
>
>
> jholtman wrote:
>>
>> Is this what you want:
>>
>>> x <- read.table(textConnection("id    code1    code2         p
>> +  1        4        8           0.1
>> +  1        5        7           0.9
>> +  2        1        8           0.4
>> +  2        6        2           0.2
>> +  2        4        3           0.6
>> +  3        5        6           0.7
>> +  3        7        5           0.9"), header=TRUE)
>>>  closeAllConnections()
>>>  # create object like output from 'melt'
>>>  x.m <- data.frame(id=c(x$id, x$id), var=c(x$code1, x$code2),
>> +     variable=rep('p', 2*nrow(x)), value=c(x$p, x$p))
>>> require(reshape)  # use the reshape package
>>> cast(x.m, id ~ var, mean)
>>   id   1   2   3   4   5   6   7   8
>> 1  1 NaN NaN NaN 0.1 0.9 NaN 0.9 0.1
>> 2  2 0.4 0.2 0.6 0.6 NaN 0.2 NaN 0.4
>> 3  3 NaN NaN NaN NaN 0.8 0.7 0.9 NaN
>>>
>>
>>
>>
>> On Tue, Nov 10, 2009 at 4:30 PM, legen  wrote:
>>>
>>> Dear all,
>>>
>>> I have a dataset as below:
>>>
>>> id    code1    code2         p
>>>  1        4        8           0.1
>>>  1        5        7           0.9
>>>  2        1        8           0.4
>>>  2        6        2           0.2
>>>  2        4        3           0.6
>>>  3        5        6           0.7
>>>  3        7        5           0.9
>>>
>>> I just want to rewrite it as this (vertical to horizontal):
>>>
>>> id   var1  var2  var3  var4  var5  var6  var7  var8
>>> 1        0      0      0    0.1   0.9       0   0.9    0.1
>>> 2     0.4    0.2   0.6    0.6      0    0.2      0    0.4
>>> 3        0      0      0      0    0.8    0.7    0.9      0
>>>
>>> For the third subject, there are two values being equal to 5 in code1 and
>>> code2, but different values in p:  0.7 and 0.9, so I assigned their
>>> average
>>> 0.8 in var5.
>>>
>>> Does anybody can help me to handle this? Many thanks for your
>>> consideration
>>> and time.
>>>
>>> Legen
>>>
>>> --
>>> View this message in context:
>>> http://old.nabble.com/Data-transformation-tp26291568p26291568.html
>>> Sent from the R help mailing list archive at Nabble.com.
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>>
>> --
>> Jim Holtman
>> Cincinnati, OH
>> +1 513 646 9390
>>
>> What is the problem that you are trying to solve?
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
> --
> View this message in context: 
> http://old.nabble.com/Data-transformation-tp26291568p26295766.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O


Re: [R] Data transformation

2009-11-10 Thread legen


Thank you for your kind help. Your script works very well. Would you please
show me how to change NaN to zero and column variables 1, 2, ..., 8 to var1,
var2, ..., var8? Thanks again.

Legen
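
For reference, a minimal base R sketch of the two steps asked about here (missing cells to zero, columns renamed to var1 ... var8). It uses tapply() instead of cast(), so it is a different route to the same table and not the script quoted below; m and res are illustrative names. The same two lines (the is.na() replacement and the paste() renaming) would probably carry over to the result of cast(), but that is untested here.

```r
x <- data.frame(id    = c(1, 1, 2, 2, 2, 3, 3),
                code1 = c(4, 5, 1, 6, 4, 5, 7),
                code2 = c(8, 7, 8, 2, 3, 6, 5),
                p     = c(0.1, 0.9, 0.4, 0.2, 0.6, 0.7, 0.9))

# mean of p for every (id, code) pair; pairs that never occur give NA
m <- tapply(rep(x$p, 2),
            list(id   = rep(x$id, 2),
                 code = factor(c(x$code1, x$code2), levels = 1:8)),
            mean)

m[is.na(m)] <- 0                                    # is.na() also catches NaN
colnames(m) <- paste("var", colnames(m), sep = "")  # 1..8 -> var1..var8

res <- data.frame(id = as.numeric(rownames(m)), m, row.names = NULL)
print(res)
```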

 

jholtman wrote:
> 
> Is this what you want:
> 
>> x <- read.table(textConnection("id    code1    code2         p
> +  1        4        8           0.1
> +  1        5        7           0.9
> +  2        1        8           0.4
> +  2        6        2           0.2
> +  2        4        3           0.6
> +  3        5        6           0.7
> +  3        7        5           0.9"), header=TRUE)
>>  closeAllConnections()
>>  # create object like output from 'melt'
>>  x.m <- data.frame(id=c(x$id, x$id), var=c(x$code1, x$code2),
> + variable=rep('p', 2*nrow(x)), value=c(x$p, x$p))
>> require(reshape)  # use the reshape package
>> cast(x.m, id ~ var, mean)
>   id   1   2   3   4   5   6   7   8
> 1  1 NaN NaN NaN 0.1 0.9 NaN 0.9 0.1
> 2  2 0.4 0.2 0.6 0.6 NaN 0.2 NaN 0.4
> 3  3 NaN NaN NaN NaN 0.8 0.7 0.9 NaN
>>
> 
> 
> 
> On Tue, Nov 10, 2009 at 4:30 PM, legen  wrote:
>>
>> Dear all,
>>
>> I have a dataset as below:
>>
>> id    code1    code2         p
>>  1        4        8           0.1
>>  1        5        7           0.9
>>  2        1        8           0.4
>>  2        6        2           0.2
>>  2        4        3           0.6
>>  3        5        6           0.7
>>  3        7        5           0.9
>>
>> I just want to rewrite it as this (vertical to horizontal):
>>
>> id   var1  var2  var3  var4  var5  var6  var7  var8
>> 1        0      0      0    0.1   0.9       0   0.9    0.1
>> 2     0.4    0.2   0.6    0.6      0    0.2      0    0.4
>> 3        0      0      0      0    0.8    0.7    0.9      0
>>
>> For the third subject, there are two values being equal to 5 in code1 and
>> code2, but different values in p:  0.7 and 0.9, so I assigned their
>> average
>> 0.8 in var5.
>>
>> Does anybody can help me to handle this? Many thanks for your
>> consideration
>> and time.
>>
>> Legen
>>
>> --
>> View this message in context:
>> http://old.nabble.com/Data-transformation-tp26291568p26291568.html
>> Sent from the R help mailing list archive at Nabble.com.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
> 
> 
> 
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
> 
> What is the problem that you are trying to solve?
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: 
http://old.nabble.com/Data-transformation-tp26291568p26295766.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Data transformation

2009-11-10 Thread jim holtman

Is this what you want:

> x <- read.table(textConnection("id    code1    code2         p
+  1        4        8           0.1
+  1        5        7           0.9
+  2        1        8           0.4
+  2        6        2           0.2
+  2        4        3           0.6
+  3        5        6           0.7
+  3        7        5           0.9"), header=TRUE)
>  closeAllConnections()
>  # create object like output from 'melt'
>  x.m <- data.frame(id=c(x$id, x$id), var=c(x$code1, x$code2),
+ variable=rep('p', 2*nrow(x)), value=c(x$p, x$p))
> require(reshape)  # use the reshape package
> cast(x.m, id ~ var, mean)
  id   1   2   3   4   5   6   7   8
1  1 NaN NaN NaN 0.1 0.9 NaN 0.9 0.1
2  2 0.4 0.2 0.6 0.6 NaN 0.2 NaN 0.4
3  3 NaN NaN NaN NaN 0.8 0.7 0.9 NaN
>



On Tue, Nov 10, 2009 at 4:30 PM, legen  wrote:
>
> Dear all,
>
> I have a dataset as below:
>
> id    code1    code2         p
>  1        4        8           0.1
>  1        5        7           0.9
>  2        1        8           0.4
>  2        6        2           0.2
>  2        4        3           0.6
>  3        5        6           0.7
>  3        7        5           0.9
>
> I just want to rewrite it as this (vertical to horizontal):
>
> id   var1  var2  var3  var4  var5  var6  var7  var8
> 1        0      0      0    0.1   0.9       0   0.9    0.1
> 2     0.4    0.2   0.6    0.6      0    0.2      0    0.4
> 3        0      0      0      0    0.8    0.7    0.9      0
>
> For the third subject, there are two values being equal to 5 in code1 and
> code2, but different values in p:  0.7 and 0.9, so I assigned their average
> 0.8 in var5.
>
> Does anybody can help me to handle this? Many thanks for your consideration
> and time.
>
> Legen
>
> --
> View this message in context: 
> http://old.nabble.com/Data-transformation-tp26291568p26291568.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


[R] Data transformation

2009-11-10 Thread legen


Dear all,

I have a dataset as below:

id    code1    code2         p
 1        4        8           0.1
 1        5        7           0.9
 2        1        8           0.4
 2        6        2           0.2
 2        4        3           0.6
 3        5        6           0.7
 3        7        5           0.9

I just want to rewrite it as this (vertical to horizontal):

id   var1  var2  var3  var4  var5  var6  var7  var8
 1      0     0     0   0.1   0.9     0   0.9   0.1
 2    0.4   0.2   0.6   0.6     0   0.2     0   0.4
 3      0     0     0     0   0.8   0.7   0.9     0

For the third subject, the value 5 appears in both code1 and code2 but with
different values of p (0.7 and 0.9), so I assigned their average, 0.8, to
var5.

Can anybody help me handle this? Many thanks for your consideration
and time.

Legen 

-- 
View this message in context: 
http://old.nabble.com/Data-transformation-tp26291568p26291568.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Data Partition Package

2009-10-28 Thread Xu

Thanks a lot. Have a nice day!

Best,

Pat

On Wed, Oct 28, 2009 at 10:29 AM, Max Kuhn  wrote:

> There are a few. I'm partial to the function in the caret package:
> createDataPartition. Also, there are functions there for
> pre-processing on training sets and applying it to new data sets.
>
> For a somewhat dated summary of the packages, see:
>
>   http://www.jstatsoft.org/v28/i05
>
> also:
>
>
> http://caret.r-forge.r-project.org/Classification_and_Regression_Training.html
>
> Max
>
>
>
>
> On Wed, Oct 28, 2009 at 11:06 AM, Xu  wrote:
> > Hi, Users,
> >
> >  I am a new user. I am trying to partition data into training and test.
> Is
> > there any R package or function that can partition dataset? Also, is
> there
> > any package do crossvalidation? Any help will be appreciated.
> >
> > Best,
> >
> > Pat
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
>
>
>
> --
>
> Max
>


Re: [R] Data Partition Package

2009-10-28 Thread Max Kuhn

There are a few. I'm partial to the function in the caret package:
createDataPartition. Also, there are functions there for
pre-processing on training sets and applying it to new data sets.
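
As a rough illustration of what such a partition does, here is a minimal base R sketch. It is a plain random split, not the stratified sampling that createDataPartition performs, and the 75/25 proportion, the 5 folds and the use of the built-in iris data are assumptions made only for the example.

```r
set.seed(1)                       # only to make the sketch repeatable
dat <- iris                       # any data frame; iris ships with R

# 1. simple random split: roughly 75% training, 25% test
n         <- nrow(dat)
train_idx <- sample(n, size = round(0.75 * n))
train     <- dat[train_idx, ]
test      <- dat[-train_idx, ]

# 2. k-fold cross-validation: give every row a fold label 1..k
k     <- 5
folds <- sample(rep(1:k, length.out = n))
for (i in 1:k) {
  cv_train <- dat[folds != i, ]   # fit the model on these rows
  cv_test  <- dat[folds == i, ]   # evaluate it on these rows
}
```

Inside the loop one would fit a model on cv_train and record its error on cv_test; averaging the k errors gives the cross-validated estimate.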

For a somewhat dated summary of the packages, see:

   http://www.jstatsoft.org/v28/i05

also:

http://caret.r-forge.r-project.org/Classification_and_Regression_Training.html

Max

On Wed, Oct 28, 2009 at 11:06 AM, Xu  wrote:
> Hi, Users,
>
>  I am a new user. I am trying to partition data into training and test. Is
> there any R package or function that can partition dataset? Also, is there
> any package do crossvalidation? Any help will be appreciated.
>
> Best,
>
> Pat
>
>        [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

-- 

Max


[R] Data Partition Package

2009-10-28 Thread Xu

Hi, Users,

  I am a new user. I am trying to partition data into training and test sets.
Is there any R package or function that can partition a dataset? Also, is
there any package that does cross-validation? Any help will be appreciated.

Best,

Pat


Re: [R] data frame is killing me! help

2009-10-26 Thread bbslover


Thank you, Petr.
It is a good, clear answer.

Thanks!

Petr Pikal wrote:
> 
> Hi
> 
>> data(gasoline)
>> str(gasoline)
> 'data.frame':   60 obs. of  2 variables:
>  $ octane: num  85.3 85.2 88.5 83.4 87.9 ...
>  $ NIR   : AsIs [1:60, 1:401] -0.050193 -0.044227 -0.046867 -0.046705 
> -0.050859 ...
>   ..- attr(*, "dimnames")=List of 2
>   .. ..$ : chr  "1" "2" "3" "4" ...
>   .. ..$ : chr  "900 nm" "902 nm" "904 nm" "906 nm" ...
>> str(gasoline$NIR)
>  AsIs [1:60, 1:401] -0.050193 -0.044227 -0.046867 -0.046705 -0.050859 ...
>  - attr(*, "dimnames")=List of 2
>   ..$ : chr [1:60] "1" "2" "3" "4" ...
>   ..$ : chr [1:401] "900 nm" "902 nm" "904 nm" "906 nm" ...
>> is.matrix(gasoline$NIR)
> [1] TRUE
> 
> so the second element of gasoline data frame is a matrix
> 
>> ?AsIs
> 
>> df<-data.frame(x=1:5, I(matrix(rnorm(10), 5,2)))
>> df
>   x matrix.rnorm.10...5..2..1 matrix.rnorm.10...5..2..2
> 1 1  0.187703  0.213312
> 2 2  -0.66264  -0.47941
> 3 3  -0.82334  -0.04324
> 4 4  -0.37255  0.883027
> 5 5  -0.28700  -1.03431
>> str(df)
> 'data.frame':   5 obs. of  2 variables:
>  $ x  : int  1 2 3 4 5
>  $ matrix.rnorm.10...5..2.: AsIs [1:5, 1:2] 0.187703 -0.66264 
> -0.82334 -0.37255 -0.28700 ...
>> 
> 
> Regards
> Petr
> 
> r-help-boun...@r-project.org napsal dne 23.10.2009 18:43:56:
> 
>> 
>> I have read that one ,I want to this method to be used to my data.but I 
> donot
>> know how to put my data into R. 
>> 
>> James W. MacDonald wrote:
>> > 
>> > 
>> > 
>> > bbslover wrote:
>> >> 
>> >> 
>> >> Steve Lianoglou-6 wrote:
>> >>> Hi,
>> >>>
>> >>> On Oct 22, 2009, at 2:35 PM, bbslover wrote:
>> >>>
>>  Usage
>>  data(gasoline)
>>  Format
>>  A data frame with 60 observations on the following 2 variables.
>>  octane
>>  a numeric vector. The octane number.
>>  NIR
>>  a matrix with 401 columns. The NIR spectrum
>> 
>>  and I see the gasoline data to see below
>>  NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR.1694 nm 
> NIR.1696 
>>  nm
>>  NIR.1698 nm NIR.1700 nm
>>  1 1.242645 1.250789 1.246626 1.250985 1.264189 1.244678 1.245913 
>>  1.221135
>>  2 1.189116 1.223242 1.253306 1.282889 1.215065 1.225211 1.227985 
>>  1.198851
>>  3 1.198287 1.237383 1.260979 1.276677 1.218871 1.223132 1.230321 
>>  1.208742
>>  4 1.201066 1.233299 1.262966 1.272709 1.211068 1.215044 1.232655 
>>  1.206696
>>  5 1.259616 1.273713 1.296524 1.299507 1.226448 1.230718 1.232864 
>>  1.202926
>>  6 1.24109 1.262138 1.288401 1.291118 1.229769 1.227615 1.22763 
>>  1.207576
>>  7 1.245143 1.265648 1.274731 1.292441 1.218317 1.218147 1.73 
>>  1.200446
>>  8 1.222581 1.245782 1.26002 1.290305 1.221264 1.220265 1.227947 
>>  1.188174
>>  9 1.234969 1.251559 1.272416 1.287405 1.211995 1.213263 1.215883 
>>  1.196102
>> 
>>  look at this NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR. 
>>  1694 nm
>>  NIR.1696 nm NIR.1698 nm NIR.1700 nm
>> 
>>  how can I add letters NIR to my variable, because my 600 
>>  independents never
>>  have NIR as the prefix. however, it is needed to model the plsr. 
> for
>>  example aa=plsr(y~NIR, data=data ,), the prefix NIR is 
>>  necessary, how
>>  can I do with it?
>> >>> I'm not really sue that I'm getting you, but if your problem is that 
>  
>> >>> the column names of your data.frame don't match the variable names 
>> >>> you'd like to use in your formula, just change the colnames of your 
>> >>> data.frame to match your formula.
>> >>>
>> >>> BTW - I have no idea where to get this gasoline data set, so I'm 
> just 
>> >>> imagining:
>> >>>
>> >>> eg.
>> >>> colnames(gasoline) <- c('put', 'the', 'variable', 'names', 'that', 
>> >>> 'you', 'want', 'here')
>> >>>
>> >>> -steve
>> >>>
>> >>> --
>> >>> Steve Lianoglou
>> >>> Graduate Student: Computational Systems Biology
>> >>>|  Memorial Sloan-Kettering Cancer Center
>> >>>|  Weill Medical College of Cornell University
>> >>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>> >>>
>> >>> __
>> >>> R-help@r-project.org mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> PLEASE do read the posting guide
>> >>> http://www.R-project.org/posting-guide.html
>> >>> and provide commented, minimal, self-contained, reproducible code.
>> >>>
>> >>>
>> >> 
>> >> thanks for you. but the numbers of indenpendence are so many, it is 
> not
>> >> easy
>> >> to identify them one by one,  is there some better way?
>> > 
>> > You don't need to identify anything. What you need to do is read the 
>> > help page for the function you want to use, so you (at the very least) 
> 
>> > know how to use the function.
>> > 
>> >  > librar

Re: [R] data frame is killing me! help

2009-10-26 Thread Petr PIKAL

Hi

> data(gasoline)
> str(gasoline)
'data.frame':   60 obs. of  2 variables:
 $ octane: num  85.3 85.2 88.5 83.4 87.9 ...
 $ NIR   : AsIs [1:60, 1:401] -0.050193 -0.044227 -0.046867 -0.046705 
-0.050859 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr  "1" "2" "3" "4" ...
  .. ..$ : chr  "900 nm" "902 nm" "904 nm" "906 nm" ...
> str(gasoline$NIR)
 AsIs [1:60, 1:401] -0.050193 -0.044227 -0.046867 -0.046705 -0.050859 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:60] "1" "2" "3" "4" ...
  ..$ : chr [1:401] "900 nm" "902 nm" "904 nm" "906 nm" ...
> is.matrix(gasoline$NIR)
[1] TRUE

So the second element of the gasoline data frame is a matrix.

> ?AsIs

> df<-data.frame(x=1:5, I(matrix(rnorm(10), 5,2)))
> df
  x matrix.rnorm.10...5..2..1 matrix.rnorm.10...5..2..2
1 1                  0.187703                  0.213312
2 2                  -0.66264                  -0.47941
3 3                  -0.82334                  -0.04324
4 4                  -0.37255                  0.883027
5 5                  -0.28700                  -1.03431
> str(df)
'data.frame':   5 obs. of  2 variables:
 $ x  : int  1 2 3 4 5
 $ matrix.rnorm.10...5..2.: AsIs [1:5, 1:2] 0.187703 -0.66264 
-0.82334 -0.37255 -0.28700 ...
> 

Regards
Petr
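
A minimal sketch of how the same structure could be built from one's own data, so that a formula such as y ~ NIR sees all predictors as a single term. The data here are random stand-ins, and raw, mydata and the column name NIR are illustrative; lm() is used only because it ships with R, and the plsr() call in the comment follows the one quoted further down in this thread.

```r
set.seed(42)
# stand-in for the real data: a response y and six predictor columns
raw <- data.frame(y = rnorm(10), matrix(rnorm(60), nrow = 10))

# keep the response as an ordinary column and wrap all predictors
# into ONE matrix column called NIR, protected by I()
mydata <- data.frame(y = raw$y, NIR = I(as.matrix(raw[, -1])))

str(mydata)
is.matrix(mydata$NIR)

# the matrix column can now be used as a single term in a formula
fit <- lm(y ~ NIR, data = mydata)

# with the pls package installed, the analogous call would be:
# fit <- pls::plsr(y ~ NIR, data = mydata, validation = "CV")
```

With 600 predictors nothing changes except the width of the matrix; no column has to be named by hand.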

r-help-boun...@r-project.org wrote on 23.10.2009 18:43:56:

> 
> I have read that one ,I want to this method to be used to my data.but I 
donot
> know how to put my data into R. 
> 
> James W. MacDonald wrote:
> > 
> > 
> > 
> > bbslover wrote:
> >> 
> >> 
> >> Steve Lianoglou-6 wrote:
> >>> Hi,
> >>>
> >>> On Oct 22, 2009, at 2:35 PM, bbslover wrote:
> >>>
>  Usage
>  data(gasoline)
>  Format
>  A data frame with 60 observations on the following 2 variables.
>  octane
>  a numeric vector. The octane number.
>  NIR
>  a matrix with 401 columns. The NIR spectrum
> 
>  and I see the gasoline data to see below
>  NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR.1694 nm 
NIR.1696 
>  nm
>  NIR.1698 nm NIR.1700 nm
>  1 1.242645 1.250789 1.246626 1.250985 1.264189 1.244678 1.245913 
>  1.221135
>  2 1.189116 1.223242 1.253306 1.282889 1.215065 1.225211 1.227985 
>  1.198851
>  3 1.198287 1.237383 1.260979 1.276677 1.218871 1.223132 1.230321 
>  1.208742
>  4 1.201066 1.233299 1.262966 1.272709 1.211068 1.215044 1.232655 
>  1.206696
>  5 1.259616 1.273713 1.296524 1.299507 1.226448 1.230718 1.232864 
>  1.202926
>  6 1.24109 1.262138 1.288401 1.291118 1.229769 1.227615 1.22763 
>  1.207576
>  7 1.245143 1.265648 1.274731 1.292441 1.218317 1.218147 1.73 
>  1.200446
>  8 1.222581 1.245782 1.26002 1.290305 1.221264 1.220265 1.227947 
>  1.188174
>  9 1.234969 1.251559 1.272416 1.287405 1.211995 1.213263 1.215883 
>  1.196102
> 
>  look at this NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR. 
>  1694 nm
>  NIR.1696 nm NIR.1698 nm NIR.1700 nm
> 
>  how can I add letters NIR to my variable, because my 600 
>  independents never
>  have NIR as the prefix. however, it is needed to model the plsr. 
for
>  example aa=plsr(y~NIR, data=data ,), the prefix NIR is 
>  necessary, how
>  can I do with it?
> >>> I'm not really sue that I'm getting you, but if your problem is that 
 
> >>> the column names of your data.frame don't match the variable names 
> >>> you'd like to use in your formula, just change the colnames of your 
> >>> data.frame to match your formula.
> >>>
> >>> BTW - I have no idea where to get this gasoline data set, so I'm 
just 
> >>> imagining:
> >>>
> >>> eg.
> >>> colnames(gasoline) <- c('put', 'the', 'variable', 'names', 'that', 
> >>> 'you', 'want', 'here')
> >>>
> >>> -steve
> >>>
> >>> --
> >>> Steve Lianoglou
> >>> Graduate Student: Computational Systems Biology
> >>>|  Memorial Sloan-Kettering Cancer Center
> >>>|  Weill Medical College of Cornell University
> >>> Contact Info: http://cbio.mskcc.org/~lianos/contact
> >>>
> >>> __
> >>> R-help@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> >>>
> >> 
> >> thanks for you. but the numbers of indenpendence are so many, it is 
not
> >> easy
> >> to identify them one by one,  is there some better way?
> > 
> > You don't need to identify anything. What you need to do is read the 
> > help page for the function you want to use, so you (at the very least) 

> > know how to use the function.
> > 
> >  > library(pls)
> >  > data(gasoline)
> >  > fit <- plsr(octane~NIR, data=gasoline, validation = "CV")
> >  > summary(fit)
> > Data:X dimension: 60 401
> >Y dimension: 60 1
> > Fit method: kernelpls
> > Number of components considered: 53
> > 
> > VALIDATION: RMSEP
>

Re: [R] Data format for KSVM

2009-10-24 Thread Uwe Ligges




Noah Silverman wrote:
> Hi,
>
> I have a process using svm from the e1071 library.

It's called a *package* which is probably installed in a *library* of
packages.

> It works.
>
> I want to try using the KSVM library instead. The same data used with
> e1071 gives me an error with KSVM.

I guess you are talking about the ksvm *function* in *package* kernlab
now, right?

> My data is a data.frame.
>
> Sample code:
>
> svm_formula <- formula(y ~ a + B + C)

You do not use svm_formula below, do you?

> svm_model <- ksvm(formula, data=train_data, type="C-svc",
> kernel="rbfdot", C=1)
>
> I get the following error:
>
> "object is not a matrix"

ksvm works for me. Please specify a reproducible example (including the
data) or give us at least the output of str(data) and specify which
versions of R and kernlab you are talking about.


Uwe Ligges
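
A small sketch of the name mix-up hinted at above, which may or may not be the cause of the reported error: the quoted call passes formula (the base R function) where svm_formula (the object just created) was presumably intended. The toy train_data is invented only so that the call can run, and the ksvm() call is skipped when kernlab is not installed.

```r
svm_formula <- formula(y ~ a + B + C)

class(svm_formula)   # "formula": the object created above
class(formula)       # "function": the base R function of the same name

# toy training data, invented here only so that the call can run
set.seed(1)
train_data <- data.frame(y = factor(rep(c("no", "yes"), each = 20)),
                         a = rnorm(40), B = rnorm(40), C = rnorm(40))

# pass the formula object together with the data frame
if (requireNamespace("kernlab", quietly = TRUE)) {
  svm_model <- kernlab::ksvm(svm_formula, data = train_data,
                             type = "C-svc", kernel = "rbfdot", C = 1)
  print(svm_model)
}
```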





> So I tried this:
>
> svm_model <- ksvm(formula, data=as.matrix(train_data), type="C-svc",
> kernel="rbfdot", C=1, scaled=FALSE)
>
> Now I get this error:
> "Error in model.fram.definition(data = list(v1 = c(1.1234, -2.3232:
> Object is not a matrix
>
> My data was previously scaled with the scale() function so that the mean
> is centered at 0 and the range is {-1,1}
>
> Can anyone provide some suggestions as to why I'm getting an error?
>
> Thanks!
>
> -N


Re: [R] data frame is killing me! help

2009-10-24 Thread bbslover


Thank you, Don MacQueen. I will try it.


Don MacQueen wrote:
> 
> At 4:57 AM -0700 10/23/09, bbslover wrote:
>>Steve Lianoglou-6 wrote:
>>>
>>>  Hi,
>>>
>>>  On Oct 22, 2009, at 2:35 PM, bbslover wrote:
>>>
  Usage
  data(gasoline)
  Format
  A data frame with 60 observations on the following 2 variables.
  octane
  a numeric vector. The octane number.
  NIR
  a matrix with 401 columns. The NIR spectrum

  and I see the gasoline data to see below
  NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR.1694 nm NIR.1696 
  nm
  NIR.1698 nm NIR.1700 nm
  1 1.242645 1.250789 1.246626 1.250985 1.264189 1.244678 1.245913 
  1.221135
  2 1.189116 1.223242 1.253306 1.282889 1.215065 1.225211 1.227985 
  1.198851
  3 1.198287 1.237383 1.260979 1.276677 1.218871 1.223132 1.230321 
  1.208742
  4 1.201066 1.233299 1.262966 1.272709 1.211068 1.215044 1.232655 
  1.206696
  5 1.259616 1.273713 1.296524 1.299507 1.226448 1.230718 1.232864 
  1.202926
  6 1.24109 1.262138 1.288401 1.291118 1.229769 1.227615 1.22763 
  1.207576
  7 1.245143 1.265648 1.274731 1.292441 1.218317 1.218147 1.73 
  1.200446
  8 1.222581 1.245782 1.26002 1.290305 1.221264 1.220265 1.227947 
  1.188174
  9 1.234969 1.251559 1.272416 1.287405 1.211995 1.213263 1.215883 
  1.196102

  look at this NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR.
  1694 nm
  NIR.1696 nm NIR.1698 nm NIR.1700 nm

  how can I add letters NIR to my variable, because my 600 
  independents never
  have NIR as the prefix. however, it is needed to model the plsr.   for
  example aa=plsr(y~NIR, data=data ,), the prefix NIR is 
  necessary, how
>>  >> can I do with it?
> 
> Perhaps using paste(). Maybe something like:
> 
> paste('NIR', 1:600,sep='.')
> or
> paste('NIR', seq(1686,1700,2),sep='.')
> 
>>  >
>>>  I'm not really sue that I'm getting you, but if your problem is that 
>>>  the column names of your data.frame don't match the variable names 
>>>  you'd like to use in your formula, just change the colnames of your 
>>>  data.frame to match your formula.
>>>
>>>  BTW - I have no idea where to get this gasoline data set, so I'm just 
>>>  imagining:
>>>
>>>  eg.
>>>  colnames(gasoline) <- c('put', 'the', 'variable', 'names', 'that', 
>>>  'you', 'want', 'here')
>>>
>>>  -steve
>>>
>>>  --
>>>  Steve Lianoglou
>>>  Graduate Student: Computational Systems Biology
>>> |  Memorial Sloan-Kettering Cancer Center
>>> |  Weill Medical College of Cornell University
>>>  Contact Info: http://*cbio.mskcc.org/~lianos/contact
>>>
>>>  __
>>>  R-help@r-project.org mailing list
>>>  https://*stat.ethz.ch/mailman/listinfo/r-help
>>>  PLEASE do read the posting guide
>>>  http://*www.*R-project.org/posting-guide.html
>>>  and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>>thanks for you. but the numbers of indenpendence are so many, it is not
easy
>>to identify them one by one,  is there some better way?
>>
>>
>>--
>>View this message in context: 
>>http://*www.*nabble.com/data-frame-is-killing-me%21-help-tp26015079p26024985.html
>>Sent from the R help mailing list archive at Nabble.com.
>>
>>__
>>R-help@r-project.org mailing list
>>https://*stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide
http://*www.*R-project.org/posting-guide.html
>>and provide commented, minimal, self-contained, reproducible code.
> 
> 
> -- 
> -
> Don MacQueen
> Lawrence Livermore National Laboratory
> Livermore, CA, USA
> 925-423-1062
> m...@llnl.gov
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/data-frame-is-killing-me%21-help-tp26015079p26036836.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] data frame is killing me! help

2009-10-24 Thread bbslover


I have tried it. paste() can add the letters I want, but I cannot paste them
onto the column names. Maybe I should study it harder.
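
For what it is worth, a minimal sketch of the missing step: paste() only returns a character vector, so the result has to be assigned back with colnames<-. The data frame d is a made-up stand-in. Note also that renaming alone may not be what plsr(y ~ NIR, ...) needs; Petr's message elsewhere in this thread shows that NIR in the gasoline data is a single matrix column.

```r
# small stand-in for a data frame of predictors
d <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# paste() only builds a character vector; nothing changes until the
# result is assigned back to the column names
colnames(d) <- paste("NIR", colnames(d), sep = ".")
colnames(d)

# or number the columns instead of reusing the old names
colnames(d) <- paste("NIR", seq_len(ncol(d)), sep = ".")
colnames(d)
```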

Don MacQueen wrote:
> 
> At 4:57 AM -0700 10/23/09, bbslover wrote:
>>Steve Lianoglou-6 wrote:
>>>
>>>  Hi,
>>>
>>>  On Oct 22, 2009, at 2:35 PM, bbslover wrote:
>>>
  Usage
  data(gasoline)
  Format
  A data frame with 60 observations on the following 2 variables.
  octane
  a numeric vector. The octane number.
  NIR
  a matrix with 401 columns. The NIR spectrum

  and I see the gasoline data to see below
  NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR.1694 nm NIR.1696 
  nm
  NIR.1698 nm NIR.1700 nm
  1 1.242645 1.250789 1.246626 1.250985 1.264189 1.244678 1.245913 
  1.221135
  2 1.189116 1.223242 1.253306 1.282889 1.215065 1.225211 1.227985 
  1.198851
  3 1.198287 1.237383 1.260979 1.276677 1.218871 1.223132 1.230321 
  1.208742
  4 1.201066 1.233299 1.262966 1.272709 1.211068 1.215044 1.232655 
  1.206696
  5 1.259616 1.273713 1.296524 1.299507 1.226448 1.230718 1.232864 
  1.202926
  6 1.24109 1.262138 1.288401 1.291118 1.229769 1.227615 1.22763 
  1.207576
  7 1.245143 1.265648 1.274731 1.292441 1.218317 1.218147 1.73 
  1.200446
  8 1.222581 1.245782 1.26002 1.290305 1.221264 1.220265 1.227947 
  1.188174
  9 1.234969 1.251559 1.272416 1.287405 1.211995 1.213263 1.215883 
  1.196102

  look at this NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR.
  1694 nm
  NIR.1696 nm NIR.1698 nm NIR.1700 nm

  how can I add letters NIR to my variable, because my 600 
  independents never
  have NIR as the prefix. however, it is needed to model the plsr.   for
  example aa=plsr(y~NIR, data=data ,), the prefix NIR is 
  necessary, how
>>  >> can I do with it?
> 
> Perhaps using paste(). Maybe something like:
> 
> paste('NIR', 1:600,sep='.')
> or
> paste('NIR', seq(1686,1700,2),sep='.')
> 
>>  >
>>>  I'm not really sue that I'm getting you, but if your problem is that 
>>>  the column names of your data.frame don't match the variable names 
>>>  you'd like to use in your formula, just change the colnames of your 
>>>  data.frame to match your formula.
>>>
>>>  BTW - I have no idea where to get this gasoline data set, so I'm just 
>>>  imagining:
>>>
>>>  eg.
>>>  colnames(gasoline) <- c('put', 'the', 'variable', 'names', 'that', 
>>>  'you', 'want', 'here')
>>>
>>>  -steve
>>>
>>>  --
>>>  Steve Lianoglou
>>>  Graduate Student: Computational Systems Biology
>>> |  Memorial Sloan-Kettering Cancer Center
>>> |  Weill Medical College of Cornell University
>>>  Contact Info: http://*cbio.mskcc.org/~lianos/contact
>>>
>>>  __
>>>  R-help@r-project.org mailing list
>>>  https://*stat.ethz.ch/mailman/listinfo/r-help
>>>  PLEASE do read the posting guide
>>>  http://*www.*R-project.org/posting-guide.html
>>>  and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>>thanks for you. but the numbers of indenpendence are so many, it is not
easy
>>to identify them one by one,  is there some better way?
>>
>>
>>--
>>View this message in context: 
>>http://*www.*nabble.com/data-frame-is-killing-me%21-help-tp26015079p26024985.html
>>Sent from the R help mailing list archive at Nabble.com.
>>
>>__
>>R-help@r-project.org mailing list
>>https://*stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide
http://*www.*R-project.org/posting-guide.html
>>and provide commented, minimal, self-contained, reproducible code.
> 
> 
> -- 
> -
> Don MacQueen
> Lawrence Livermore National Laboratory
> Livermore, CA, USA
> 925-423-1062
> m...@llnl.gov
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/data-frame-is-killing-me%21-help-tp26015079p26036875.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] data frame is killing me! help

2009-10-23 Thread bbslover


I have read that one. I want to use this method on my own data, but I do not
know how to put my data into R.
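A minimal sketch of one way to get such data into the shape that the gasoline example uses, i.e. a data frame with the response in one column and all predictors packed into a single matrix-valued column. The column names (y, x1, x2, x3), the object names raw and mydata, and the inline numbers are made up for illustration; in practice the table would presumably come from read.csv() or read.table() on your own file.

```r
# Hypothetical data; in practice something like read.csv("mydata.csv")
raw <- read.table(text = "
y    x1   x2   x3
88.1 1.24 1.25 1.24
85.3 1.18 1.22 1.25
87.9 1.19 1.23 1.26
86.0 1.20 1.23 1.26
", header = TRUE)

# Keep the response as an ordinary column ...
mydata <- data.frame(y = raw$y)

# ... and store every predictor together as one matrix column called NIR
mydata$NIR <- as.matrix(raw[, setdiff(names(raw), "y")])

str(mydata)
```

With the data in this shape, a formula such as y ~ NIR refers to the whole predictor matrix at once, so the individual columns never need to be named in the formula.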

James W. MacDonald wrote:
> 
> 
> 
> bbslover wrote:
>> 
>> 
>> Steve Lianoglou-6 wrote:
>>> Hi,
>>>
>>> On Oct 22, 2009, at 2:35 PM, bbslover wrote:
>>>
 Usage
 data(gasoline)
 Format
 A data frame with 60 observations on the following 2 variables.
 octane
 a numeric vector. The octane number.
 NIR
 a matrix with 401 columns. The NIR spectrum

 and I see the gasoline data to see below
 NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR.1694 nm NIR.1696  
 nm
 NIR.1698 nm NIR.1700 nm
 1 1.242645 1.250789 1.246626 1.250985 1.264189 1.244678 1.245913  
 1.221135
 2 1.189116 1.223242 1.253306 1.282889 1.215065 1.225211 1.227985  
 1.198851
 3 1.198287 1.237383 1.260979 1.276677 1.218871 1.223132 1.230321  
 1.208742
 4 1.201066 1.233299 1.262966 1.272709 1.211068 1.215044 1.232655  
 1.206696
 5 1.259616 1.273713 1.296524 1.299507 1.226448 1.230718 1.232864  
 1.202926
 6 1.24109 1.262138 1.288401 1.291118 1.229769 1.227615 1.22763  
 1.207576
 7 1.245143 1.265648 1.274731 1.292441 1.218317 1.218147 1.73  
 1.200446
 8 1.222581 1.245782 1.26002 1.290305 1.221264 1.220265 1.227947  
 1.188174
 9 1.234969 1.251559 1.272416 1.287405 1.211995 1.213263 1.215883  
 1.196102

 look at this NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR. 
 1694 nm
 NIR.1696 nm NIR.1698 nm NIR.1700 nm

 how can I add letters NIR to my variable, because my 600  
 independents never
 have NIR as the prefix. however, it is needed to model the plsr.   for
 example aa=plsr(y~NIR, data=data ,), the prefix NIR is  
 necessary, how
 can I do with it?
>>> I'm not really sue that I'm getting you, but if your problem is that  
>>> the column names of your data.frame don't match the variable names  
>>> you'd like to use in your formula, just change the colnames of your  
>>> data.frame to match your formula.
>>>
>>> BTW - I have no idea where to get this gasoline data set, so I'm just  
>>> imagining:
>>>
>>> eg.
>>> colnames(gasoline) <- c('put', 'the', 'variable', 'names', 'that',  
>>> 'you', 'want', 'here')
>>>
>>> -steve
>>>
>>> --
>>> Steve Lianoglou
>>> Graduate Student: Computational Systems Biology
>>>|  Memorial Sloan-Kettering Cancer Center
>>>|  Weill Medical College of Cornell University
>>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>> 
>> thanks for you. but the numbers of indenpendence are so many, it is not
>> easy
>> to identify them one by one,  is there some better way?
> 
> You don't need to identify anything. What you need to do is read the 
> help page for the function you want to use, so you (at the very least) 
> know how to use the function.
> 
>  > library(pls)
>  > data(gasoline)
>  > fit <- plsr(octane~NIR, data=gasoline, validation = "CV")
>  > summary(fit)
> Data: X dimension: 60 401
>   Y dimension: 60 1
> Fit method: kernelpls
> Number of components considered: 53
> 
> VALIDATION: RMSEP
> Cross-validated using 10 random segments.
> (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
> CV   1.5431.372   0.3827   0.2522   0.2347   0.2455   0.2281
> adjCV1.5431.367   0.3740   0.2497   0.2360   0.2407   0.2243
> 7 comps  8 comps  9 comps  10 comps  11 comps  12 comps  13 comps
> CV  0.2311   0.2352   0.24550.25340.27370.28140.2832
> adjCV   0.2257   0.2303   0.23950.24730.26460.27050.2726
> 14 comps  15 comps  16 comps  17 comps  18 comps  19 comps  20
> comps
> CV   0.29130.29320.29850.31370.32890.3323   
> 0.3391
> adjCV0.28080.28210.28630.30080.31410.3172   
> 0.3228
> 21 comps  22 comps  23 comps  24 comps  25 comps  26 comps  27
> comps
> CV   0.34760.33840.33160.32130.31550.3118   
> 0.3062
> adjCV0.33070.32170.31540.30570.30020.2964   
> 0.2908
> 28 comps  29 comps  30 comps  31 comps  32 comps  33 comps  34
> comps
> CV   0.30330.30340.30740.30830.30940.3087   
> 0.3105
> adjCV0.28810.28810.29170.29260.29360.2929   
> 0.2946
> 35 comps  36 comps  37 comps  38 comps  39 comps  40 comps  41
> comps
> CV   0.31080.31060.31050.31040.31040.3105   
> 0.3105
> adjCV0.29490.29470.29460.29450.29450.2945   
> 0.2946
> 42 comps  43 comps  44 comps  45 comps  46 comps  47 comps  48
> com

Re: [R] data frame is killing me! help

2009-10-23 Thread Don MacQueen

At 4:57 AM -0700 10/23/09, bbslover wrote:

Steve Lianoglou-6 wrote:

 Hi,

 On Oct 22, 2009, at 2:35 PM, bbslover wrote:

 Usage
 data(gasoline)
 Format
 A data frame with 60 observations on the following 2 variables.
 octane
 a numeric vector. The octane number.
 NIR
 a matrix with 401 columns. The NIR spectrum

 and I see the gasoline data to see below
 NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR.1694 nm NIR.1696 
 nm

 NIR.1698 nm NIR.1700 nm
 1 1.242645 1.250789 1.246626 1.250985 1.264189 1.244678 1.245913 
 1.221135
 2 1.189116 1.223242 1.253306 1.282889 1.215065 1.225211 1.227985 
 1.198851
 3 1.198287 1.237383 1.260979 1.276677 1.218871 1.223132 1.230321 
 1.208742
 4 1.201066 1.233299 1.262966 1.272709 1.211068 1.215044 1.232655 
 1.206696
 5 1.259616 1.273713 1.296524 1.299507 1.226448 1.230718 1.232864 
 1.202926
 6 1.24109 1.262138 1.288401 1.291118 1.229769 1.227615 1.22763 
 1.207576
 7 1.245143 1.265648 1.274731 1.292441 1.218317 1.218147 1.73 
 1.200446
 8 1.222581 1.245782 1.26002 1.290305 1.221264 1.220265 1.227947 
 1.188174
 9 1.234969 1.251559 1.272416 1.287405 1.211995 1.213263 1.215883 
 1.196102

 look at this NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR.
 1694 nm
 NIR.1696 nm NIR.1698 nm NIR.1700 nm

 how can I add letters NIR to my variable, because my 600 
 independents never

 have NIR as the prefix. however, it is needed to model the plsr.   for
 example aa=plsr(y~NIR, data=data ,), the prefix NIR is 
 necessary, how

 >> can I do with it?

Perhaps using paste(). Maybe something like:

   paste('NIR', 1:600, sep='.')
or
   paste('NIR', seq(1686, 1700, 2), sep='.')
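As a sketch of how names built with paste() could then be attached to the columns: the matrix X below is a made-up stand-in for the real predictors, and only base R is used.

```r
# Hypothetical predictor matrix: 9 samples, 8 unnamed columns
X <- matrix(rnorm(9 * 8), nrow = 9)

# Build the names in one go and assign them to the columns
colnames(X) <- paste("NIR", seq(1686, 1700, by = 2), sep = ".")

colnames(X)
```

The same assignment works on a data frame, since colnames<- accepts any character vector of the right length.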

 >
 I'm not really sure that I'm getting you, but if your problem is that 
 the column names of your data.frame don't match the variable names 
 you'd like to use in your formula, just change the colnames of your 
 data.frame to match your formula.

 BTW - I have no idea where to get this gasoline data set, so I'm just 
 imagining:

 eg.
 colnames(gasoline) <- c('put', 'the', 'variable', 'names', 'that', 
 'you', 'want', 'here')

 -steve

 --
 Steve Lianoglou
 Graduate Student: Computational Systems Biology
|  Memorial Sloan-Kettering Cancer Center
|  Weill Medical College of Cornell University
 Contact Info: http://*cbio.mskcc.org/~lianos/contact

 __
 R-help@r-project.org mailing list
 https://*stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://*www.*R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

Thank you. But there are so many independent variables that it is not easy
to identify them one by one. Is there a better way?

--
View this message in context: 
http://*www.*nabble.com/data-frame-is-killing-me%21-help-tp26015079p26024985.html

Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://*stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://*www.*R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
-
Don MacQueen
Lawrence Livermore National Laboratory
Livermore, CA, USA
925-423-1062
m...@llnl.gov


[R] Data format for KSVM

2009-10-23 Thread Noah Silverman


Hi,

I have a process using svm from the e1071 library.  It works.

I want to try using the KSVM library instead.  The same data used with 
e1071 gives me an error with KSVM.


My data is a data.frame.

sample code:

svm_formula <- formula(y ~ a + B + C)

svm_model <- ksvm(formula, data=train_data, type="C-svc", 
kernel="rbfdot", C=1)


I get the following error:

"object is not a matrix"

So I tried this:

svm_model <- ksvm(formula, data=as.matrix(train_data), type="C-svc", 
kernel="rbfdot", C=1, scaled=FALSE)


Now I get this error:
"Error in model.fram.definition(data = list(v1 = c(1.1234, -2.3232:
Object is not a matrix

My data was previously scaled with the scale() function so that the mean 
is centered at 0 and the range is {-1,1}.


Can anyone provide some suggestions as to why I'm getting an error?

Thanks!

-N
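One thing that may be worth checking in the sample code: the formula is stored under the name svm_formula, but both calls pass formula, which is the name of a base R function. Whether this is what produces the reported error is an assumption, not something confirmed in the thread. The check below uses base R only; the commented-out call assumes the kernlab package is loaded and that train_data is the data frame from the post.

```r
svm_formula <- formula(y ~ a + B + C)

class(svm_formula)     # "formula": the object that was actually built
is.function(formula)   # TRUE: the bare name `formula` is the stats function

# The call would then presumably pass the stored object instead:
# svm_model <- ksvm(svm_formula, data = train_data, type = "C-svc",
#                   kernel = "rbfdot", C = 1)
```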


Re: [R] data frame is killing me! help

2009-10-23 Thread James W. MacDonald




bbslover wrote:



Steve Lianoglou-6 wrote:

Hi,

On Oct 22, 2009, at 2:35 PM, bbslover wrote:


Usage
data(gasoline)
Format
A data frame with 60 observations on the following 2 variables.
octane
a numeric vector. The octane number.
NIR
a matrix with 401 columns. The NIR spectrum

and I see the gasoline data to see below
NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR.1694 nm NIR.1696  
nm

NIR.1698 nm NIR.1700 nm
1 1.242645 1.250789 1.246626 1.250985 1.264189 1.244678 1.245913  
1.221135
2 1.189116 1.223242 1.253306 1.282889 1.215065 1.225211 1.227985  
1.198851
3 1.198287 1.237383 1.260979 1.276677 1.218871 1.223132 1.230321  
1.208742
4 1.201066 1.233299 1.262966 1.272709 1.211068 1.215044 1.232655  
1.206696
5 1.259616 1.273713 1.296524 1.299507 1.226448 1.230718 1.232864  
1.202926
6 1.24109 1.262138 1.288401 1.291118 1.229769 1.227615 1.22763  
1.207576
7 1.245143 1.265648 1.274731 1.292441 1.218317 1.218147 1.73  
1.200446
8 1.222581 1.245782 1.26002 1.290305 1.221264 1.220265 1.227947  
1.188174
9 1.234969 1.251559 1.272416 1.287405 1.211995 1.213263 1.215883  
1.196102


look at this NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR. 
1694 nm

NIR.1696 nm NIR.1698 nm NIR.1700 nm

how can I add letters NIR to my variable, because my 600  
independents never

have NIR as the prefix. however, it is needed to model the plsr.   for
example aa=plsr(y~NIR, data=data ,), the prefix NIR is  
necessary, how

can I do with it?
I'm not really sure that I'm getting you, but if your problem is that  
the column names of your data.frame don't match the variable names  
you'd like to use in your formula, just change the colnames of your  
data.frame to match your formula.


BTW - I have no idea where to get this gasoline data set, so I'm just  
imagining:


eg.
colnames(gasoline) <- c('put', 'the', 'variable', 'names', 'that',  
'you', 'want', 'here')


-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
   |  Memorial Sloan-Kettering Cancer Center
   |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




Thank you. But there are so many independent variables that it is not easy
to identify them one by one. Is there a better way?


You don't need to identify anything. What you need to do is read the 
help page for the function you want to use, so you (at the very least) 
know how to use the function.


> library(pls)
> data(gasoline)
> fit <- plsr(octane~NIR, data=gasoline, validation = "CV")
> summary(fit)
Data:   X dimension: 60 401
Y dimension: 60 1
Fit method: kernelpls
Number of components considered: 53

VALIDATION: RMSEP
Cross-validated using 10 random segments.
       (Intercept)  1 comps  2 comps  3 comps  4 comps  5 comps  6 comps
CV           1.543    1.372   0.3827   0.2522   0.2347   0.2455   0.2281
adjCV        1.543    1.367   0.3740   0.2497   0.2360   0.2407   0.2243
       7 comps  8 comps  9 comps  10 comps  11 comps  12 comps  13 comps
CV      0.2311   0.2352   0.2455    0.2534    0.2737    0.2814    0.2832
adjCV   0.2257   0.2303   0.2395    0.2473    0.2646    0.2705    0.2726
       14 comps  15 comps  16 comps  17 comps  18 comps  19 comps  20 comps
CV       0.2913    0.2932    0.2985    0.3137    0.3289    0.3323    0.3391
adjCV    0.2808    0.2821    0.2863    0.3008    0.3141    0.3172    0.3228
       21 comps  22 comps  23 comps  24 comps  25 comps  26 comps  27 comps
CV       0.3476    0.3384    0.3316    0.3213    0.3155    0.3118    0.3062
adjCV    0.3307    0.3217    0.3154    0.3057    0.3002    0.2964    0.2908
       28 comps  29 comps  30 comps  31 comps  32 comps  33 comps  34 comps
CV       0.3033    0.3034    0.3074    0.3083    0.3094    0.3087    0.3105
adjCV    0.2881    0.2881    0.2917    0.2926    0.2936    0.2929    0.2946
       35 comps  36 comps  37 comps  38 comps  39 comps  40 comps  41 comps
CV       0.3108    0.3106    0.3105    0.3104    0.3104    0.3105    0.3105
adjCV    0.2949    0.2947    0.2946    0.2945    0.2945    0.2945    0.2946
       42 comps  43 comps  44 comps  45 comps  46 comps  47 comps  48 comps
CV       0.3105    0.3105    0.3105    0.3105    0.3105    0.3105    0.3105
adjCV    0.2946    0.2946    0.2946    0.2946    0.2946    0.2946    0.2946
       49 comps  50 comps  51 comps  52 comps  53 comps
CV       0.3105    0.3105    0.3105    0.3105    0.3105
adjCV    0.2946    0.2946    0.2946    0.2946    0.2946

TRAINING: % variance explained
        1 comps  2 comps  3 comps  4 comps  5 comps  6 comps  7 comps  8 comps
X         70.97    78.56    86.15     95.4    96.12    96.97    97.32     98.1
octane    31.90    94.66    97.71     98.0    98.68    98.93    99.06     99.1

Re: [R] data frame is killing me! help

2009-10-23 Thread bbslover




Steve Lianoglou-6 wrote:
> 
> Hi,
> 
> On Oct 22, 2009, at 2:35 PM, bbslover wrote:
> 
>> Usage
>> data(gasoline)
>> Format
>> A data frame with 60 observations on the following 2 variables.
>> octane
>> a numeric vector. The octane number.
>> NIR
>> a matrix with 401 columns. The NIR spectrum
>>
>> and I see the gasoline data to see below
>> NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR.1694 nm NIR.1696  
>> nm
>> NIR.1698 nm NIR.1700 nm
>> 1 1.242645 1.250789 1.246626 1.250985 1.264189 1.244678 1.245913  
>> 1.221135
>> 2 1.189116 1.223242 1.253306 1.282889 1.215065 1.225211 1.227985  
>> 1.198851
>> 3 1.198287 1.237383 1.260979 1.276677 1.218871 1.223132 1.230321  
>> 1.208742
>> 4 1.201066 1.233299 1.262966 1.272709 1.211068 1.215044 1.232655  
>> 1.206696
>> 5 1.259616 1.273713 1.296524 1.299507 1.226448 1.230718 1.232864  
>> 1.202926
>> 6 1.24109 1.262138 1.288401 1.291118 1.229769 1.227615 1.22763  
>> 1.207576
>> 7 1.245143 1.265648 1.274731 1.292441 1.218317 1.218147 1.73  
>> 1.200446
>> 8 1.222581 1.245782 1.26002 1.290305 1.221264 1.220265 1.227947  
>> 1.188174
>> 9 1.234969 1.251559 1.272416 1.287405 1.211995 1.213263 1.215883  
>> 1.196102
>>
>> look at this NIR.1686 nm NIR.1688 nm NIR.1690 nm NIR.1692 nm NIR. 
>> 1694 nm
>> NIR.1696 nm NIR.1698 nm NIR.1700 nm
>>
>> how can I add letters NIR to my variable, because my 600  
>> independents never
>> have NIR as the prefix. however, it is needed to model the plsr.   for
>> example aa=plsr(y~NIR, data=data ,), the prefix NIR is  
>> necessary, how
>> can I do with it?
> 
> I'm not really sue that I'm getting you, but if your problem is that  
> the column names of your data.frame don't match the variable names  
> you'd like to use in your formula, just change the colnames of your  
> data.frame to match your formula.
> 
> BTW - I have no idea where to get this gasoline data set, so I'm just  
> imagining:
> 
> eg.
> colnames(gasoline) <- c('put', 'the', 'variable', 'names', 'that',  
> 'you', 'want', 'here')
> 
> -steve
> 
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
>|  Memorial Sloan-Kettering Cancer Center
>|  Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 
> 

Thank you. But there are so many independent variables that it is not easy
to identify them one by one. Is there a better way?


-- 
View this message in context: 
http://www.nabble.com/data-frame-is-killing-me%21-help-tp26015079p26024985.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] data file with columns of unequal length

2009-10-23 Thread Jim Lemon


On 10/23/2009 10:07 PM, William Simpson wrote:

Thanks Jim. BTW the times in x and y are in ascending order (time of
occurrence).

If I do it this way, how do I actually read the data in and store in
the file? Toy code, please.



Hi Bill,
This seems a bit like some heartbeat data that I had to deal with some 
years ago. There were two output streams, one the time of each R wave 
and the other an asynchronous stimulus. In that case, I had to work out 
code to interdigitate the signals and do some processing of the 
resulting data. It sounds like you have two files, each recording times 
of input and output respectively one value to a line. If so, my first 
guess is:


x<-as.vector(read.table("input_times.dat"))
y<-as.vector(read.table("output_times.dat"))
xy<-list(x,y)
# to store this list in a file
# oops, bit too quick on the keyboard
dput(xy,file="xy_23_10_2009.R")

Jim
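A self-contained variant of the idea above, as a sketch only. It swaps read.table() plus as.vector() for scan(), because as.vector() applied to the data frame that read.table() returns does not, as far as I know, give a plain numeric vector. The temporary files and the event times are made-up stand-ins for input_times.dat and output_times.dat.

```r
# Hypothetical stand-ins for the two data files, one value per line
in_file  <- tempfile(fileext = ".dat")
out_file <- tempfile(fileext = ".dat")
writeLines(c("0.5", "1.7", "3.2"), in_file)
writeLines(c("0.6", "0.9", "1.8", "2.1", "3.3", "3.9"), out_file)

# scan() returns a plain numeric vector
x <- scan(in_file, quiet = TRUE)
y <- scan(out_file, quiet = TRUE)
xy <- list(x = x, y = y)

# Store the list as R source text, then read it back
xy_file <- tempfile(fileext = ".R")
dput(xy, file = xy_file)
xy2 <- dget(xy_file)

lengths(xy2)
```

Because dput() writes a text representation, the stored file stays human-readable and keeps both vectors, at their different lengths, in one place.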


Re: [R] data file with columns of unequal length

2009-10-23 Thread Jim Lemon


On 10/23/2009 10:07 PM, William Simpson wrote:

Thanks Jim. BTW the times in x and y are in ascending order (time of
occurrence).

If I do it this way, how do I actually read the data in and store in
the file? Toy code, please.

   

Hi Bill,
This seems a bit like some heartbeat data that I had to deal with some 
years ago. There were two output streams, one the time of each R wave 
and the other an asynchronous stimulus. In that case, I had to work out 
code to interdigitate the signals and do some processing of the 
resulting data. It sounds like you have two files, each recording times 
of input and output respectively one value to a line. If so, my first 
guess is:


x<-as.vector(read.table("input_times.dat"))
y<-as.vector(read.table("output_times.dat"))
xy<-list(x,y)
# to store this list in a file
write.csv(xy,file="xy_23_10_2009.csv")

Jim


Re: [R] data file with columns of unequal length

2009-10-23 Thread William Simpson

OK thanks, I looked at sleep and I get it.

Bill

On Fri, Oct 23, 2009 at 12:21 PM, Peter Dalgaard
 wrote:
> William Simpson wrote:
>>> As I understand it, they don't come in pairs anyway.
>> Correct.
>>
>>> For the same reason
>>> a data frame is just the wrong kind of data structure. If you don't want
>>> separate data files, you can use one file with two columns where the
>>> second column is (say) 1 for the x and 2 for the y.
>> Could you explain this further? I don't really get it.
>> My xs and ys have different lengths. How would I read them in?
>
> Same format as when you have data from two different groups. Don't
> really know how to make it clearer than that. Like the "sleep" dataset
> (OK, so they are actually paired...).
>
>>
>> Ideally I would just read in the x column and y columns separately
>>
>> x<-read.file("file.dat", column1)
>> y<-read.file("file.dat", column2)
>> But I know of no way to do that...
>>
>> [Oh, by the way, I never mentioned this, the x and ys are event times
>> and are in ascending order (by time of occurrence).]
>>
>>> (There are other options, like concatenating the x and the y with some
>>> sort of separator inbetween, but it easily gets painful to read them
>>> back in.)
>
> --
>   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
>  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
> ~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907
>
>


Re: [R] data file with columns of unequal length

2009-10-23 Thread Peter Dalgaard

William Simpson wrote:
>> As I understand it, they don't come in pairs anyway.
> Correct.
> 
>> For the same reason
>> a data frame is just the wrong kind of data structure. If you don't want
>> separate data files, you can use one file with two columns where the
>> second column is (say) 1 for the x and 2 for the y.
> Could you explain this further? I don't really get it.
> My xs and ys have different lengths. How would I read them in?

Same format as when you have data from two different groups. Don't
really know how to make it clearer than that. Like the "sleep" dataset
(OK, so they are actually paired...).
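For reference, the layout in question can be inspected directly in the built-in sleep data: one column holds the values and another says which group each value belongs to. A small base R sketch (the name by_group is just for illustration):

```r
data(sleep)

# One value column (extra) and one indicator column (group)
head(sleep[, c("extra", "group")], 3)

# Splitting on the indicator recovers one vector per group
by_group <- split(sleep$extra, sleep$group)
lengths(by_group)
```

In sleep the two groups happen to have the same number of rows, but nothing in this layout requires that.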

> 
> Ideally I would just read in the x column and y columns separately
> 
> x<-read.file("file.dat", column1)
> y<-read.file("file.dat", column2)
> But I know of no way to do that...
> 
> [Oh, by the way, I never mentioned this, the x and ys are event times
> and are in ascending order (by time of occurrence).]
> 
>> (There are other options, like concatenating the x and the y with some
>> sort of separator inbetween, but it easily gets painful to read them
>> back in.)

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907


Re: [R] data file with columns of unequal length

2009-10-23 Thread William Simpson

Thanks Jim. BTW the times in x and y are in ascending order (time of
occurrence).

If I do it this way, how do I actually read the data in and store in
the file? Toy code, please.

Bill

>
> Hi Bill,
>
> xy<-list(x=1:10,y=1:100)
>
> Note that this cheerfully ignores how you are going to figure out which x
> goes with which y(s).
>
> Jim
>


Re: [R] data file with columns of unequal length

2009-10-23 Thread William Simpson

> As I understand it, they don't come in pairs anyway.
Correct.

> For the same reason
> a data frame is just the wrong kind of data structure. If you don't want
> separate data files, you can use one file with two columns where the
> second column is (say) 1 for the x and 2 for the y.
Could you explain this further? I don't really get it.
My xs and ys have different lengths. How would I read them in?

Ideally I would just read in the x column and y columns separately

x<-read.file("file.dat", column1)
y<-read.file("file.dat", column2)
But I know of no way to do that...

[Oh, by the way, I never mentioned this, the x and ys are event times
and are in ascending order (by time of occurrence).]

> (There are other options, like concatenating the x and the y with some
> sort of separator inbetween, but it easily gets painful to read them
> back in.)
>

Thanks
Bill


Re: [R] data file with columns of unequal length

2009-10-23 Thread William Simpson

The way you do it is to compute the cross-intensity function (you can
google this; a key name is David Brillinger). The general problem is
that of system identification for point processes.

Bill
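A very crude sketch of what a histogram-style estimate of such a function might look like. This is not taken from Brillinger's work; it is plain counting of pairwise lags, and the function name cross_intensity, its arguments and the normalisation are assumptions made for illustration.

```r
# Hypothetical helper: for each lag bin, count how many y events fall at that
# lag relative to an x event, then scale by the number of x events and bin width.
cross_intensity <- function(x, y, max_lag, n_bins) {
  d <- as.vector(outer(y, x, "-"))      # every pairwise difference y_j - x_i
  d <- d[abs(d) < max_lag]              # keep lags inside the window
  breaks <- seq(-max_lag, max_lag, length.out = n_bins + 1)
  counts <- hist(d, breaks = breaks, plot = FALSE)$counts
  bin_width <- 2 * max_lag / n_bins
  data.frame(lag  = head(breaks, -1) + bin_width / 2,   # bin midpoints
             rate = counts / (length(x) * bin_width))   # y events per x event per unit time
}

x <- c(1, 2, 3)     # made-up input event times
y <- x + 0.25       # made-up output events, each 0.25 after an input
ci <- cross_intensity(x, y, max_lag = 1, n_bins = 4)
ci
```

Note that x and y enter only as two vectors of event times, so their lengths are free to differ.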

On Fri, Oct 23, 2009 at 10:31 AM, Jim Lemon  wrote:
> On 10/23/2009 07:58 PM, William Simpson wrote:
>>
>> I am running an expt that presents a point process input x and
>> measures a point process output y. The times of each event are
>> recorded. The lengths of the data records of x and y are necessarily
>> different, and can be different by a factor of 10. I would like to
>> save these data after each experiment as a file with two columns, one
>> for x and one for y.
>>
>> However, R dataframes require columns of equal length. One solution is
>> to fill the "empty" places in y with NAs so it has the same length as
>> x. I view that as unsatisfactory (there are in reality no missing
>> values). Another possibility is to store x and y in separate files. I
>> also view that as unsatisfactory (it is too easy to lose track of the
>> y file corresponding to a given x file).
>>
>> Can anyone suggest a way to deal with this situation?
>>
>>
>
> Hi Bill,
>
> xy<-list(x=1:10,y=1:100)
>
> Note that this cheerfully ignores how you are going to figure out which x
> goes with which y(s).
>
> Jim
>
>


Re: [R] data file with columns of unequal length

2009-10-23 Thread Peter Dalgaard

Jim Lemon wrote:
> On 10/23/2009 07:58 PM, William Simpson wrote:
>> I am running an expt that presents a point process input x and
>> measures a point process output y. The times of each event are
>> recorded. The lengths of the data records of x and y are necessarily
>> different, and can be different by a factor of 10. I would like to
>> save these data after each experiment as a file with two columns, one
>> for x and one for y.
>>
>> However, R dataframes require columns of equal length. One solution is
>> to fill the "empty" places in y with NAs so it has the same length as
>> x. I view that as unsatisfactory (there are in reality no missing
>> values). Another possibility is to store x and y in separate files. I
>> also view that as unsatisfactory (it is too easy to lose track of the
>> y file corresponding to a given x file).
>>
>> Can anyone suggest a way to deal with this situation?
>>
>>
> Hi Bill,
> 
> xy<-list(x=1:10,y=1:100)
> 
> Note that this cheerfully ignores how you are going to figure out which
> x goes with which y(s).

As I understand it, they don't come in pairs anyway. For the same reason
a data frame is just the wrong kind of data structure. If you don't want
separate data files, you can use one file with two columns where the
second column is (say) 1 for the x and 2 for the y.

(There are other options, like concatenating the x and the y with some
sort of separator in between, but it easily gets painful to read them
back in.)
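A sketch of the one-file, two-column layout described above, using base R only. The event times, the column names time and series, and the temporary file are made up for illustration.

```r
x <- c(0.5, 1.7, 3.2)                  # hypothetical input event times
y <- c(0.6, 0.9, 1.8, 2.1, 3.3, 3.9)   # hypothetical output event times

# Stack both series; the second column marks 1 for x and 2 for y
long <- data.frame(time   = c(x, y),
                   series = rep(c(1, 2), times = c(length(x), length(y))))

f <- tempfile(fileext = ".dat")
write.table(long, f, row.names = FALSE)

# Reading back: one file, and the indicator column separates the two series
back <- read.table(f, header = TRUE)
x2 <- back$time[back$series == 1]
y2 <- back$time[back$series == 2]
```

Since each row carries its own label, no padding with NA is needed when the two series have different lengths.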

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907


Re: [R] data file with columns of unequal length

2009-10-23 Thread Jim Lemon


On 10/23/2009 07:58 PM, William Simpson wrote:

I am running an expt that presents a point process input x and
measures a point process output y. The times of each event are
recorded. The lengths of the data records of x and y are necessarily
different, and can be different by a factor of 10. I would like to
save these data after each experiment as a file with two columns, one
for x and one for y.

However, R dataframes require columns of equal length. One solution is
to fill the "empty" places in y with NAs so it has the same length as
x. I view that as unsatisfactory (there are in reality no missing
values). Another possibility is to store x and y in separate files. I
also view that as unsatisfactory (it is too easy to lose track of the
y file corresponding to a given x file).

Can anyone suggest a way to deal with this situation?

   

Hi Bill,

xy<-list(x=1:10,y=1:100)

Note that this cheerfully ignores how you are going to figure out which 
x goes with which y(s).


Jim
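If the list route is taken, one possible way to keep both vectors together in a single file is R's own serialisation. This is a sketch only: the thread itself does not mention saveRDS(), and the file name is a temporary placeholder.

```r
xy <- list(x = 1:10, y = 1:100)

# Write the whole list to one file and read it back unchanged
f <- tempfile(fileext = ".rds")
saveRDS(xy, f)
xy_back <- readRDS(f)

identical(xy, xy_back)
```

The file is binary rather than plain text, so it is convenient within R but less so if other programs need to read the data.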


[R] data file with columns of unequal length

2009-10-23 Thread William Simpson

I am running an expt that presents a point process input x and
measures a point process output y. The times of each event are
recorded. The lengths of the data records of x and y are necessarily
different, and can be different by a factor of 10. I would like to
save these data after each experiment as a file with two columns, one
for x and one for y.

However, R dataframes require columns of equal length. One solution is
to fill the "empty" places in y with NAs so it has the same length as
x. I view that as unsatisfactory (there are in reality no missing
values). Another possibility is to store x and y in separate files. I
also view that as unsatisfactory (it is too easy to lose track of the
y file corresponding to a given x file).

Can anyone suggest a way to deal with this situation?

Thanks very much for any help.

Bill

