Re: [R] how to ignore NA with NA or NULL

2012-06-06 Thread jeff6868
Hello,

I added your flags to my code, but there are still errors.
I then tried a few things:

- In function na.fill, I changed:
if(all(!is.na(y[1:8700,1])))  return(NA)  to
if(all(!is.finite(y[1:8700,1])))  return(y)
in order to leave this file unchanged.

That removed my dimension problem. I no longer get errors from
refill <- process.all(lst, corhiver2008capt1), only some warning
messages that can be read with warnings().

Then I noticed that in refill (the object that my code should fill),
files containing only NAs are turned into NULL. So these objects have
0 rows instead of being left unchanged (35000 rows).
When I then convert refill to a data.frame, it fails because of a new
dimension problem caused by these NULL files.

But I don't understand where in my code these files are turned into NULL.
Could you tell me how to keep my all-NA files unchanged in the output,
as they were at the beginning, instead of NULL?
Thanks again.



--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-ignore-NA-with-NA-or-NULL-tp4632287p4632506.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to ignore NA with NA or NULL

2012-06-06 Thread Jeff Newmiller
Please read the posting guide mentioned at the bottom of every message.

You might also benefit from reading 
http://stackoverflow.com/questions/5963269/how-to-make-a-great-reproducible-example.
 We would certainly benefit from not having to guess what problems you are 
really encountering.

Also, it seems that you refer to in-memory data as files... this is imprecise 
and confusing. Learn to use the str() function to know what kinds of objects 
you are referring to... in this case I believe you are referring to data frames.
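
As a small illustration of the str() advice, here is a sketch with a hypothetical list named stations (a stand-in for an object such as refill, not the poster's data). It shows how str(), is.null() and NROW() reveal which entries are data frames and which have become NULL:

```r
# Hypothetical list standing in for an object such as 'refill'
stations <- list(
  ST001 = data.frame(x1 = c(1.5, NA, 3.2)),
  ST002 = NULL,                                  # an entry lost along the way
  ST003 = data.frame(x1 = c(NA_real_, NA_real_, NA_real_))
)

str(stations)   # compact summary of the type and size of every element

# Which entries are NULL, and how many rows does each one have?
null_entries <- vapply(stations, is.null, logical(1))
n_rows <- vapply(stations, NROW, integer(1))
print(null_entries)
print(n_rows)
```

An entry reported as NULL with 0 rows is a different thing from a data frame whose values are all NA, which still has its full number of rows.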
---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.



Re: [R] how to ignore NA with NA or NULL

2012-06-06 Thread jeff6868
OK Jeff, but then it will be a big one. I'm working on a list of files, and my
problem depends on several functions used earlier, so it is very hard
for me to reduce it to something that reproduces my error. But here is a
reproducible example, with the error at the last line of the code (just copy
and paste it).
You'll notice that the data.frame with only NAs is set to NULL in refill,
and I just want it unchanged in the output (the same as the input).
The aim of the function is to fill the NAs of my data.frames. It will not work
in this example, because there are only big NA gaps, which are my problem for
the moment. But maybe now you can see where the problem is (the all-NA
DF should come out identical to the input DF instead of NULL).
For the example, we are only testing x1.
I hope you understand my problem now :)
Thanks Jeff, Rui, or anyone else!

# my data for example
DF1 <- data.frame(x1=rnorm(1:20),x2=c(31:50))
write.table(DF1,"ST001_2008.csv",sep=";")
DF2 <- data.frame(x1=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,rnorm(1:10)),x2=c(1:20))
write.table(DF2,"ST002_2008.csv",sep=";")
DF3 <- data.frame(x1=rnorm(81:100),x2=NA)
write.table(DF3,"ST003_2008.csv",sep=";")
DF4 <- data.frame(x1=c(21:40),x2=rnorm(1:20))
write.table(DF4,"ST004_2008.csv",sep=";")

#list my data
filenames <- list.files(pattern="\\_2008.csv$")

Sensors <- paste("x", 1:2,sep="")

Stations <- substr(filenames,1,5)

nsensors <- length(Sensors)
nstations <- length(Stations)

nobs <- nrow(read.table(filenames[1], header=TRUE))

yr2008 <- array(NA, dim=c(nobs, nsensors, nstations))

for(i in seq_len(nstations)){
  tmp <- read.table(filenames[i], header=TRUE, sep=";")
  yr2008[ , , i] <- as.matrix(tmp[, Sensors])
}

dimnames(yr2008) <- list(seq.int(nobs), Sensors, Stations)

yr2008capt1hiver <- yr2008[1:10,1,]
yr2008capt1hiver <- as.data.frame(yr2008capt1hiver)

#correlation between my data for x1 (for the example)
corhiver2008capt1 <- cor(yr2008capt1hiver,use="pairwise.complete.obs")

capt1hiver <- c(1:length(yr2008capt1hiver))

for(i in 1:length(capt1hiver))
{
  if(sum(!is.na(yr2008capt1hiver[,capt1hiver[i]])) < (length(yr2008capt1hiver[[capt1hiver[i]]])/2))
  {
    corhiver2008capt1[i,]=NA
    corhiver2008capt1[,i]=NA
  }
}


lst <- lapply(list.files(pattern="\\_2008.csv$"), read.table,sep=";",
              header=TRUE, stringsAsFactors=FALSE)
names(lst) <- Stations

# searching the highest correlation for each data.frame
get.max.cor <- function(station, mat){
  mat[row(mat) == col(mat)] <- -Inf
  m <- max(mat[station, ],na.rm=TRUE)
  if (is.finite(m)) {return(which( mat[station, ] == m ))}
  else {return(NA)}
}

# fill the data.frame with the data.frame which has the highest
# correlation coefficient
na.fill <- function(x, y){
  if(all(!is.finite(y[1:10,1])))  return(y)
  i <- is.na(x[1:10,1])
  xx <- y[1:10,1]
  new <- data.frame(xx=xx)
  x[1:10,1][i] <- predict(lm(x[1:10,1]~xx, na.action=na.exclude),new)[i]
  x
}

process.all <- function(df.list, mat){

  f <- function(station)
    na.fill(df.list[[ station ]], df.list[[ max.cor[station] ]])

  g <- function(station){
    x <- df.list[[station]]
    if(any(!is.finite(x[1:10,1]))){
      mat[row(mat) == col(mat)] <- -Inf
      nas <- which(is.na(x[1:10,1]))
      ord <- order(mat[station, ], decreasing = TRUE)[-c(1, ncol(mat))]
      for(y in ord){
        if(all(!is.na(df.list[[y]][1:10,1][nas]))){
          xx <- df.list[[y]][1:10,1]
          new <- data.frame(xx=xx)
          x[1:10,1][nas] <- predict(lm(x[1:10,1]~xx, na.action=na.exclude), new)[nas]
          break
        }
      }
    }
    x
  }

  n <- length(df.list)
  nms <- names(df.list)
  max.cor <- sapply(seq.int(n), get.max.cor, corhiver2008capt1)
  df.list <- lapply(seq.int(n), f)
  df.list <- lapply(seq.int(n), g)
  names(df.list) <- nms
  df.list
}

refill <- process.all(lst, corhiver2008capt1)
refill <- as.data.frame(refill)

## HERE IS THE PROBLEM ##
head(refill)

--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-ignore-NA-with-NA-or-NULL-tp4632287p4632527.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] how to ignore NA with NA or NULL

2012-06-06 Thread Jeff Newmiller
It is still not clear what solution you would consider a success. On the one hand, 
you said you needed the NULLs, but you also want one big data frame.

Does

refill <- refill[ -which( sapply( refill, is.null ), arr.ind=TRUE ) ]
refill <- as.data.frame( refill )

do what you want? If you need to keep the nulls, perhaps don't overwrite the 
refill list?
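
A hedged sketch of both options, using hypothetical stand-ins lst_in (the inputs) and refilled (the result with one NULL entry). Note that negative indexing with which() drops every element when nothing is NULL, so a logical index is used here instead:

```r
# Hypothetical stand-ins for the input list and the processed list
lst_in   <- list(ST1 = data.frame(x = 1:3),
                 ST2 = data.frame(x = c(NA, NA, NA)))
refilled <- list(ST1 = data.frame(x = 1:3), ST2 = NULL)

is_null <- vapply(refilled, is.null, logical(1))

# Option 1: drop the NULL entries (still correct when no entry is NULL)
dropped <- refilled[!is_null]

# Option 2: put the untouched input back wherever the result is NULL,
# so every entry keeps its original number of rows
restored <- refilled
restored[is_null] <- lst_in[is_null]

# Pitfall: -which() on an all-FALSE vector selects nothing at all
none_null <- list(a = 1, b = 2)
emptied <- none_null[-which(vapply(none_null, is.null, logical(1)))]
length(emptied)
```

Option 2 only works if the original list is still available, which is one reason not to overwrite it.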


Re: [R] how to ignore NA with NA or NULL

2012-06-06 Thread jeff6868
Thanks again for your help, Jeff.
Sorry if I'm not very clear. It is hard to explain in programming terms,
and even harder in English, as I'm French.
But I'll try again.

Your suggestion removes the error, but it is not the result I'm
expecting. You removed the NULL data.frames, but I need to keep them; or
rather, not keep them as they are, but turn them into something non-NULL.

I'll try to show you, with a very small made-up example, what I want the
results to be.
Imagine these are my 3 input data frames (10 rows each):
ST1 <- data.frame(x1=c(1:10))
ST2 <- data.frame(x2=c(1:5,NA,NA,8:10))
ST3 <- data.frame(x3=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA))

The aim of my code is to fill all the NAs in my data.frames with data,
according to the correlation coefficients between my data.frames (for example,
if there are NAs in ST1, ST1 must be filled with data from the file best
correlated with ST1, chosen between ST2 and ST3 in this example).

As ST3 has no data, I cannot get any correlation coefficient for it. So the NAs
in ST3 cannot be filled, and ST3 cannot be used to fill another file either. So
ST3 is of no use, if you like. Nevertheless, I want to keep ST3 unchanged
throughout my code.
For the moment, my code would give this for refill (with the NAs in my
data.frames filled):

ST1 <- data.frame(x1=c(1:10))
ST2 <- data.frame(x2=c(1:5,6,7,8:10))
ST3 <- NULL

But what I actually want in refill is this:

ST1 <- data.frame(x1=c(1:10))
ST2 <- data.frame(x2=c(1:5,6,7,8:10))
ST3 <- data.frame(x3=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA))

So I don't want data.frames with only NAs to be NULL in refill;
I want them to be identical to the input. I need this so that the
data.frames have the same dimensions in input and output.
If they are set to NULL (as happens for the moment, though I don't understand
why, and I want to change it), this data.frame will have 0 rows instead
of 10 rows like the other data.frames.

So I think there is something wrong in my code, in function process.all or
na.fill, or maybe in lst.
We don't seem to be far from the solution, but I still haven't found it.
For information, in functions process.all and na.fill, x is the
data.frame I want to fill, and y is the file that will be used to fill x
(the file best correlated with x).

I really hope I've been clear enough this time.
Thank you!



--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-ignore-NA-with-NA-or-NULL-tp4632287p4632546.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] how to ignore NA with NA or NULL

2012-06-06 Thread Rui Barradas

Hello,

Why don't you test an all(is.na(x)) condition? If TRUE, return(NA), not 
NULL.
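
A minimal sketch of such a guard, written so that it returns the station itself (x) rather than NA when no usable donor exists. The name na.fill.safe and the parameter n are hypothetical. The first lines show one plausible way a NULL donor could arise, which is an assumption and not something confirmed in this thread: indexing a list with a missing index gives NULL.

```r
# Indexing a list with NA yields NULL; this may be how a NULL 'y' reaches
# na.fill when no correlation partner was found (an assumption).
donors <- list(data.frame(v = 1:3))
is.null(donors[[NA_integer_]])

# Sketch only: 'na.fill.safe' and 'n' are hypothetical names.
na.fill.safe <- function(x, y, n = nrow(x)) {
  rows <- seq_len(n)
  # No usable donor: hand back the station to be filled, unchanged
  if (is.null(y) || all(is.na(y[rows, 1]))) return(x)
  gaps <- is.na(x[rows, 1])
  # Nothing to fill, or nothing to fit the regression on: leave x alone
  if (!any(gaps) || all(gaps)) return(x)
  xx <- y[rows, 1]
  fit <- lm(x[rows, 1] ~ xx, na.action = na.exclude)
  x[rows, 1][gaps] <- predict(fit, data.frame(xx = xx))[gaps]
  x
}

# Toy data: one gap in x, a complete donor y, and an all-NA station
x <- data.frame(a = c(1, 2, NA, 4, 5))
y <- data.frame(b = c(2, 4, 6, 8, 10))
all_na <- data.frame(a = rep(NA_real_, 5))

filled <- na.fill.safe(x, y)          # gap filled from the regression on y
kept   <- na.fill.safe(all_na, y)     # all-NA station comes back as it went in
```

Returning x instead of NA or y keeps every list entry at its original dimensions, which is what a later as.data.frame() step would need.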


Rui Barradas



Re: [R] how to ignore NA with NA or NULL

2012-06-05 Thread jeff6868
Thanks again, but my errors are still there. Could they be coming from the next
function? (I combine these 2 functions, but I thought the problem came from the
first one.)

process.all <- function(df.list, mat){

  f <- function(station)
    na.fill(df.list[[ station ]], df.list[[ max.cor[station] ]])

  g <- function(station){
    x <- df.list[[station]]
    if(any(is.na(x[1:8700,1]))){
      mat[row(mat) == col(mat)] <- -Inf
      nas <- which(is.na(x[1:8700,1]))
      ord <- order(mat[station, ], decreasing = TRUE)[-c(1, ncol(mat))]
      for(y in ord){
        if(all(!is.na(df.list[[y]][1:8700,1][nas]))){
          xx <- df.list[[y]][1:8700,1]
          new <- data.frame(xx=xx)
          x[1:8700,1][nas] <- predict(lm(x[1:8700,1]~xx, na.action=na.exclude), new)[nas]
          break
        }
      }
    }
    x
  }

  n <- length(df.list)
  nms <- names(df.list)
  max.cor <- sapply(seq.int(n), get.max.cor, corhiver2008capt1)
  df.list <- lapply(seq.int(n), f)
  df.list <- lapply(seq.int(n), g)
  names(df.list) <- nms
  df.list
}

refill <- process.all(lst, corhiver2008capt1)
refill <- as.data.frame(refill)

The error occurs when refill is created. That step applies process.all, in which
na.fill is also used. Do you see any error or missing code that
could cause this NA problem when I introduce files containing only NAs?

--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-ignore-NA-with-NA-or-NULL-tp4632287p4632388.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] how to ignore NA with NA or NULL

2012-06-05 Thread Rui Barradas

Hello,

I believe the error is in function 'g'. If I'm right, follow these steps:

1. Just before the first if, include
flag <- TRUE
2. Just before for(y in ord), include
flag <- FALSE
3. Just before break, include
flag <- TRUE
4. Change the return value from simply x to
if(flag) x else NA


The code loops through the ordered matrix until it finds a df.list element 
with no NAs in the respective positions. Nothing guarantees that such a list 
element exists. The changes above check for this by setting a flag.
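
The flag pattern can be sketched on plain vectors as follows. This is a simplified illustration, not the poster's function: fill_or_na and its arguments are hypothetical names, and a target with no data at all would still make lm() fail, so that case needs its own guard.

```r
# Sketch only: 'fill_or_na' and its arguments are hypothetical names.
# x: numeric vector with gaps; candidates: list of donor vectors, best first.
fill_or_na <- function(x, candidates) {
  flag <- TRUE                        # nothing to fill counts as success
  nas <- which(is.na(x))
  if (length(nas) > 0) {
    flag <- FALSE                     # from here on a donor must be found
    for (xx in candidates) {
      if (all(!is.na(xx[nas]))) {     # this donor covers every gap
        fit <- lm(x ~ xx, na.action = na.exclude)
        x[nas] <- predict(fit, data.frame(xx = xx))[nas]
        flag <- TRUE
        break
      }
    }
  }
  if (flag) x else NA                 # NA signals that no donor was usable
}

x    <- c(1, 2, NA, 4)
good <- c(10, 20, 30, 40)             # complete donor
bad  <- c(10, 20, NA, 40)             # missing exactly where x is missing

with_donor    <- fill_or_na(x, list(bad, good))  # skips 'bad', uses 'good'
without_donor <- fill_or_na(x, list(bad))        # no usable donor
```

Returning a bare NA changes the shape of that list entry, so code that later expects a data frame for every station has to handle it.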


Rui Barradas



[R] how to ignore NA with NA or NULL

2012-06-04 Thread jeff6868
Hello dear R-users,

I have a problem in my code with ignoring NA values without removing them.
I'm working on a list of files. The aim is to fill one file from another
according to the highest correlation (correlation coefficients between all my
files, so the donor is the file that most resembles the one I want to fill).
When I have just small gaps of NA, my function works well.
The problem arises when some files contain only NAs. In that case, no
correlation coefficient can be calculated (for a file with only NAs, my
previous function returns NA for the correlation coefficient),
so the file cannot be filled or used in any calculation.

Nevertheless, in my work I need to keep these NA files in my list (and so
keep their dimensions). Otherwise I get dimension problems, and
my function needs to be automatic for every file.

So my question in this post is: how can I ignore (or, if you prefer, do nothing
with) NA files that have NA correlation coefficients?
The function for filling files (where the problem is) is:

na.fill <- function(x, y){
i <- is.na(x[1:8700,1])
xx <- y[1:8700,1]
new <- data.frame(xx=xx)
x[1:8700,1][i] <- predict(lm(x[1:8700,1]~xx, na.action=na.exclude), new)[i]
x
}

My error message is: Error in model.frame.default(formula = x[1:8700, 1] ~
xx, na.action = na.exclude,  : invalid type (NULL) for variable 'xx'

I tried adding this to the function:
ifelse( all(is.null(xx))==TRUE,return(NA),xx)  or
ifelse( all(is.null(xx))==TRUE,return(NULL),xx)

but it still doesn't work.
How can I write that in my function? With NA, NULL, or in another way?
Thank you very much for your answers.


--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-ignore-NA-with-NA-or-NULL-tp4632287.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] how to ignore NA with NA or NULL

2012-06-04 Thread Jeff Newmiller
I find that avoiding using the return() function at all makes my code easier to 
follow. In your case it is simply incorrect, though, since ifelse is a vector 
function and return is a control flow function.

Your code is not reproducible and your description isn't clear about how you 
are handling the return result from this function, so I can't be sure what you 
are really asking, but I suspect you just want flow control, so use (untested):

na.fill <- function(x, y){
  i <- is.na(x[1:8700,1])
  xx <- y[1:8700,1]
  new <- data.frame(xx=xx)
  if ( !all(is.na(xx)) ) {
    x[1:8700,1][i] <- predict(lm(x[1:8700,1]~xx, na.action=na.exclude),new)[i]
  }
  x
}


Re: [R] how to ignore NA with NA or NULL

2012-06-04 Thread Rui Barradas

Hello,

'ifelse' is vectorized, what you want is the plain 'if'.

if(all(is.na(xx))) return(NA)
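
To make the difference concrete, here is a small sketch. The helper first_or_na is a hypothetical name used only for illustration. ifelse() returns one value per element of its test, whereas if() takes a single TRUE/FALSE and decides which code runs, so that is where return() belongs. Testing is.null() first also avoids the warning that is.na() gives when applied to NULL.

```r
xx <- c(NA, NA, NA)

# ifelse() is vectorised: one result per element of the test
replaced <- ifelse(is.na(xx), 0, xx)

# if() is flow control: a single condition, so return() can be used with it
first_or_na <- function(xx) {        # hypothetical helper for illustration
  if (is.null(xx) || all(is.na(xx))) return(NA)
  xx[!is.na(xx)][1]                  # first non-missing value
}

r_all_na <- first_or_na(xx)
r_null   <- first_or_na(NULL)
r_value  <- first_or_na(c(NA, 5, 7))
```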

Hope this helps,

Rui Barradas



Re: [R] how to ignore NA with NA or NULL

2012-06-04 Thread jeff6868
Thanks for answering Jeff.
Yes sorry it's not easy to explain my problem. I'll try to give you a
reproductible example (even if it'll not be exactly like my files), and I'll
try to explain my function and what I want to do more precisely.

Imagine for the example: df1, df2 and df3 are my files:
df1 - data.frame(x1=c(rnorm(1:5),NA,NA,rnorm(8:10)))
df2 - data.frame(x2=rnorm(1:10))
df3 - data.frame(x3=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA))
df - list(df1,df2,df3)

I want to fill each NA gaps of my files. If I have only df1 and df2 in my
list, it'll work. If I introduce df3 (a file with only NAs), R won't
understand what to do.

In my function:

na.fill - function(x, y){
i - is.na(x[1:10,1])
xx - y[1:10,1]
new - data.frame(xx=xx)
x[1:10,1][i] - predict(lm(x[1:10,1]~xx, na.action=na.exclude),
new)[i]
x
}

x is the file I want to fill. So i lists all the NA gaps of the file.
xx is the file that will be used to fill x (actually the best correlated
file with x according to all my files).
And then I apply a linear regression between my 2 files: x and xx to
take predicted values from xx to put in the gaps of x.

Before I had files containing only NAs, it worked well. But since I
introduced some files with no data (only NAs), I have this problem.
I got different NA-related errors when I tried a few solutions:
Error in model.frame.default(formula = x[1:8700, 1] ~ xx, na.action = na.exclude,  :
  invalid type (NULL) for variable 'xx' OR
0 (non-NA) cases OR
is.na() applied to non-(list or vector) of type 'NULL'
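
For readers trying to reproduce these messages, the following sketch shows
one way each of them can arise. It is an illustration, not the code from
this thread: the helper msg and the variables xx_null and xx_na are
made-up names, and the exact wording of the messages depends on the R
version and locale.

```r
# Returns the message of the first error or warning raised by expr, if any.
msg <- function(expr) {
  tryCatch({ expr; "no condition" },
           error   = function(e) conditionMessage(e),
           warning = function(w) conditionMessage(w))
}

x       <- rnorm(10)
xx_null <- NULL               # e.g. the donor slot in the list is NULL
xx_na   <- rep(NA_real_, 10)  # e.g. the donor contains only NAs

msg(is.na(xx_null))   # warning about is.na() applied to type 'NULL'
msg(lm(x ~ xx_null))  # error: invalid type (NULL) for the variable
msg(lm(x ~ xx_na))    # error: 0 (non-NA) cases
```

So a NULL predictor and an all-NA predictor fail in different places, which
is why the message changes depending on what was tried.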

What I'm looking for is a way, inside na.fill, to avoid these problems by
leaving the all-NA files out of the calculation (maybe something like
na.pass) while keeping them in the list. The aim would be to keep them
unchanged (for example, if the ST1 file contains 30 values that are all NA
on input, I want ST1 with the same 30 NAs on output), and the calculation
should still run with such files in my list even if the code does nothing
with them.
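
A minimal sketch of that "keep them unchanged" behaviour over a whole list
might look like the following. The helpers is_all_na and fill_or_keep are
hypothetical, and using df2 as the donor for every element is an assumption
made only to keep the example short.

```r
# Sketch only: is_all_na and fill_or_keep are hypothetical helpers.
is_all_na <- function(d) all(is.na(d[[1]]))

fill_or_keep <- function(x, donor) {
  gaps <- is.na(x[[1]])
  # All-NA target, all-NA donor, or no gaps at all: return x as it came in.
  if (is_all_na(x) || is_all_na(donor) || !any(gaps)) return(x)
  d   <- data.frame(target = x[[1]], donor = donor[[1]])
  fit <- lm(target ~ donor, data = d, na.action = na.exclude)
  x[[1]][gaps] <- predict(fit, newdata = d)[gaps]
  x
}

set.seed(42)
lst <- list(
  df1 = data.frame(x1 = c(rnorm(5), NA, NA, rnorm(3))),
  df2 = data.frame(x2 = rnorm(10)),
  df3 = data.frame(x3 = rep(NA_real_, 10))
)

refill <- lapply(lst, fill_or_keep, donor = lst$df2)
sapply(refill, nrow)           # every element keeps its 10 rows
out <- do.call(cbind, refill)  # so binding into one data frame works
```

Because every branch returns a data frame with the original number of rows,
no element of the result is NULL or a bare NA, and the later conversion to a
single data frame has consistent dimensions.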

Hope you've understood. Thanks again for your help.

--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-ignore-NA-with-NA-or-NULL-tp4632287p4632314.html
Sent from the R help mailing list archive at Nabble.com.

Re: [R] how to ignore NA with NA or NULL

2012-06-04 Thread jeff6868
Hello Rui,

Sorry, I read your post after replying to Jeff.

It does indeed seem to be better than ifelse, thanks. But I still get some
errors:
Error in x[1:8700, 1] : incorrect number of dimensions AND
In is.na(xx) : is.na() applied to non-(list or vector) of type 'NULL'

It seems to have changed the length of my data, because of these NAs.

--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-ignore-NA-with-NA-or-NULL-tp4632287p4632315.html
Sent from the R help mailing list archive at Nabble.com.

Re: [R] how to ignore NA with NA or NULL

2012-06-04 Thread Rui Barradas

Hello again,

The complete function would be

na.fill <- function(x, y){
    # do this immediately, may save copying
    if(all(is.na(y[1:8700,1]))) return(NA)
    i <- is.na(x[1:8700,1])
    xx <- y[1:8700,1]
    new <- data.frame(xx=xx)
    x[1:8700,1][i] <- predict(lm(x[1:8700,1]~xx, na.action=na.exclude), new)[i]

    x
}

Rui Barradas

On 04-06-2012 16:05, Rui Barradas wrote:

Hello,

'ifelse' is vectorized, what you want is the plain 'if'.

if(all(is.na(xx))) return(NA)
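
The difference can be seen in a short sketch. The function describe and its
return strings are invented for illustration; the is.null() test is placed
first on the assumption that the value may be NULL, as in the reported
error.

```r
xx <- c(NA, NA, NA)

# ifelse() works element by element and returns a vector shaped like its test.
ifelse(is.na(xx), 0, xx)

# if() expects a single TRUE/FALSE, so collapse the test with all() first.
describe <- function(v) {
  if (is.null(v) || all(is.na(v))) return("nothing to fit")
  "has data"
}

describe(xx)
describe(NULL)
describe(c(1, NA, 3))
```

Checking is.null() before is.na() also avoids the warning that is.na()
raises when it is handed NULL.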

Hope this helps,

Rui Barradas

On 04-06-2012 09:56, jeff6868 wrote:

Hello dear R-users,

I have a problem in my code with ignoring NA values without removing them.

I'm working on a list of files. The aim is to fill one file from another
according to the highest correlation (the correlation coefficient between
all my files, so the file that most resembles the one I want to fill).
When I have just small gaps of NA, my function works well.
The problem arises when some files contain only NAs. As a consequence, no
correlation coefficient can be calculated (my previous function returns NA
for the correlation coefficient when a file contains only NAs), and so the
file cannot be filled or used in any calculation.

Nevertheless, in my work I need to keep these NA files in my list (and so
keep their dimensions). Otherwise it creates dimension problems, and my
function needs to be automatic for every file.

So my question in this post is: how can I ignore (or do nothing with, if
you prefer) NA files with NA correlation coefficients?
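
The selection step described above is not shown in the thread, so the
following is only a guess at its shape: best_donor is a hypothetical helper
that picks, for each series, the other series with the largest absolute
correlation and yields NA when no coefficient can be computed.

```r
# Sketch only: best_donor is hypothetical; the real selection code is not shown.
best_donor <- function(m) {
  # Pairwise deletion gives NA (not an error) when a pair has no complete cases.
  r <- suppressWarnings(cor(m, use = "pairwise.complete.obs"))
  diag(r) <- NA                       # a series cannot be its own donor
  apply(r, 1, function(row) {
    if (all(is.na(row))) NA_character_ else names(which.max(abs(row)))
  })
}

set.seed(7)
a <- rnorm(20)
m <- cbind(a = a, b = a + rnorm(20, sd = 0.1), c = rnorm(20), d = NA_real_)
m[3:4, "a"] <- NA                     # small gaps in one series

best_donor(m)
```

A series whose donor comes back as NA can then be skipped by the filling
step instead of being passed to lm().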
The function for filling files (where the problem is) is:

na.fill <- function(x, y){
 i <- is.na(x[1:8700,1])
 xx <- y[1:8700,1]
 new <- data.frame(xx=xx)
 x[1:8700,1][i] <- predict(lm(x[1:8700,1]~xx, na.action=na.exclude), new)[i]
 x
 }

My error message is:
Error in model.frame.default(formula = x[1:8700, 1] ~ xx, na.action = na.exclude,  :
  invalid type (NULL) for variable 'xx'

I tried to add in the function:
ifelse( all(is.null(xx))==TRUE,return(NA),xx)  or
ifelse( all(is.null(xx))==TRUE,return(NULL),xx)

but it still doesn't work.
How can I write that in my function? With NA, NULL or in another way?
Thank you very much for your answers.


--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-ignore-NA-with-NA-or-NULL-tp4632287.html

Sent from the R help mailing list archive at Nabble.com.
