Re: [R] How to add error bars to lattice xyplot

2021-10-11 Thread Luigi Marongiu
Thank you, but on the example in use, it draws at each x a triplet of
y from the same class linked by a segment. It is essentially a strip
plot rather than a scatter plot...

On Mon, Oct 11, 2021 at 10:28 PM Bert Gunter  wrote:
>
> Your panel function needs to plot the points! See the line marked ### below
>
> xyplot(Value ~ Concentration,
>group = Substance, data = df,
>pch = 16, cex = 1.2, type = "b",
>xlab=expression(bold(paste("Concentration (", mu, "M)"))),
>ylab=expression(bold("Infection rate")),
>col=COLS,
>scales = list(x = list(log = 10, at=c(unique(df$Concentration))
>)
>),
>key = list(space="top", columns=4, col = "black",
>   points=list(pch=c(16, 16, 16, 16),
>   col=COLS
>   ),
>   text=list(c("A", "B", "C", "D")
>   )
>),
>panel = function (x,y,...) {
>   panel.xyplot(x,y, ...)  ###
>   panel.segments(x0 = log10(df$Concentration),
>  x1 = log10(df$Concentration),
>  y0 = df$Value - dfsd$Value,
>  y1 = df$Value + dfsd$Value,
>  col = COLS)
>}
>)
>
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and 
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Mon, Oct 11, 2021 at 12:24 PM Luigi Marongiu  
> wrote:
>>
>> Thanks,
>> now I got the bars (although without notch) but I lost the main plot:
>> ```
>> xyplot(Value ~ Concentration,
>> group = Substance, data = df,
>> pch = 16, cex = 1.2, type = "b",
>> xlab=expression(bold(paste("Concentration (", mu, "M)"))),
>> ylab=expression(bold("Infection rate")),
>> col=COLS,
>> scales = list(x = list(log = 10, at=c(unique(df$Concentration))
>> )
>> ),
>> key = list(space="top", columns=4, col = "black",
>> points=list(pch=c(16, 16, 16, 16),
>> col=COLS
>> ),
>> text=list(c("A", "B", "C", "D")
>> )
>> ),
>> panel = function (x,y) {
>> panel.segments(x0 = log10(df$Concentration),
>> x1 = log10(df$Concentration),
>> y0 = df$Value - dfsd$Value,
>> y1 = df$Value + dfsd$Value,
>> col = COLS)
>> }
>>
>> )
>> ```
>> I will check xYplot out, I think it is the tool for the job.
>>
>> On Mon, Oct 11, 2021 at 3:56 PM Deepayan Sarkar
>>  wrote:
>> >
>> > On Mon, Oct 11, 2021 at 5:41 PM Luigi Marongiu  
>> > wrote:
>> > >
>> > > Hello,
>> > > I am trying to plot data using lattice. The basic plot works:
>> > > ```
>> > > Substance = rep(c("A", "B", "C", "D"),4)
>> > > Concentration = rep(1:4,4),
>> > > Value = c(62.8067, 116.2633,  92.2600,   9.8733,  
>> > > 14.8233,
>> > >   92.3733, 98.9567,   1.4833,   0.6467,  
>> > > 50.6600,
>> > >   25.7533,   0.6900, 0.2167,   7.4067,   
>> > > 6.9200,
>> > >   0.0633)
>> > > df = data.frame(Substance, Concentration, Value, stringsAsFactors = 
>> > > FALSE)
>> > > Value = c(15.2974126, 16.3196089, 57.4294280,  9.1943370, 20.5567321,
>> > > 14.0874424,
>> > >38.3626672, 0.3780653,  0.4738495, 37.9124874, 16.2473916,  
>> > > 0.7218726,
>> > >0.2498666,  8.4537585, 10.8058456,  0.1096966)
>> > > dfsd = data.frame(Substance, Concentration, Value, stringsAsFactors = 
>> > > FALSE)
>> > >
>> > > library(lattice)
>> > > COLS = c("gold", "forestgreen", "darkslategray3", "purple")
>> > > xyplot(Value ~ Concentration,
>> > >group = Substance, data = df,
>> > >pch = 16, cex = 1.2, type = "b",
>> > >xlab=expression(bold(paste("Concentration (", mu, "M)"))),
>> > >ylab=expression(bold("Infection rate")),
>> > >col=COLS,
>> > >scales = list(x = list(log = 10, at=c(unique(df$Concentration))
>> > >)
>> > >),
>> > >key = list(space="top", columns=4, col = "black",
>> > >   points=list(pch=c(16, 16, 16, 16),
>> > >   col=COLS,
>> > >   text=list(c("6-PN", "8-PN", "IX", "XN")
>> > >   )
>> > >   )
>> > >)
>> > >
>> > > )
>> > > ```
>> > > but how do I add the error bars?
>> > > I tried with
>> > > ```
>> > > xyplot(Value ~ Concentration,
>> > >group = Substance, data = df,
>> > >pch = 16, cex = 1.2, type = "b",
>> > >xlab=expression(bold(paste("Concentration (", mu, "M)"))),
>> > >ylab=expression(bold("Infection rate")),
>> > >col=COLS,
>> > >scales = list(x = list(log = 10, at=c(unique(df$Concentration))
>> > >)
>> > >),
>> > >key = list(space="top", columns=4, col = "black",
>> > >   points=list(pch=c(16, 16, 16, 16),
>> > >   col=COLS,
>> > >   text=list(c("6-PN", "8-PN", "IX", "XN")
>> > > 

Re: [R] assumptions about how things are done

2021-10-11 Thread Avi Gross via R-help
I appreciate the feedback from several people. As noted, I do not want a deep 
philosophical discussion here and my main point remains not to expect software 
you "borrow" to do what you WANT but to accommodate it doing what it should.

Most of the time, I would say that I want a vectorized function to do things
exactly the way ifelse() does it. I mean A+B vectorized adds corresponding
entries in any order and perhaps even using multiple cores to do it
concurrently for larger vectors. But we normally don't care about the details,
just the result. We want ALL the conditions evaluated and all the then and else
parts and the appropriate ones combined into a result. Other than some storage
considerations, and maybe efficiency considerations, it matters little if it is
done in a gradual loop way or some other.

My point is that some people are used to loops, and if they assume that not
everything will be evaluated, they may run into trouble when full evaluation
matters. Your example about division by zero is a case where you might change
your code to avoid it. One way is to compute the then or else vector before
calling ifelse() on the result, doing that calculation carefully in a way that
tests for dividing by zero and does something appropriate to avoid it or trap
the error. Another is the wrapper method I mentioned. And, if you really need
not to evaluate some things, such as to avoid side effects, ifelse() may not be
what you use then.
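
A minimal sketch of the first approach, not taken from the thread: the vector `x` is made up, and sqrt() of a negative number stands in for the problematic operation because it raises a warning in R, whereas dividing a numeric by zero silently yields Inf or NaN.

```r
x <- c(4, -1, 9, -16)   # illustrative data only

# ifelse() evaluates sqrt(x) for every element, so the negative entries
# raise a "NaNs produced" warning even though their results are discarded.
warned <- FALSE
naive <- withCallingHandlers(
  ifelse(x >= 0, sqrt(x), NA_real_),
  warning = function(w) {
    warned <<- TRUE                  # record that a warning was raised
    invokeRestart("muffleWarning")   # then silence it
  }
)

# Alternative: preallocate the result and compute only where the condition holds.
ok <- x >= 0
safe <- rep(NA_real_, length(x))
safe[ok] <- sqrt(x[ok])
```

Both `naive` and `safe` hold the same values; only the first version triggers the warning, because only it evaluates sqrt() on the entries that are later thrown away.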

I was wondering if I do depend on the times R does non-standard evaluation too 
much that I think it can be done anytime. Obviously, not. I have often seen 
anomalous results when I forgot that changing something multiple times that 
gets evaluated ONCE later, does not work even if my intent was say to make 
lines that are dashed then others that are dotted and so on.

Many other languages I use do not have this gimmick and it can be annoying but 
realistic to have to pass some things explicitly such as a text version of the 
formula used so it can be displayed in the result. Most functions only see the 
result of arguments passed after evaluation.

So a strength of R can also be ...

-Original Message-
From: R-help  On Behalf Of Jorgen Harmse via 
R-help
Sent: Monday, October 11, 2021 11:08 AM
To: r-help@r-project.org
Subject: Re: [R] assumptions about how things are done

As noted by Richard O'Keefe, what usually happens in an R function is that any 
argument is evaluated either in its entirety or not at all. A few functions use 
substitute or similar trickery, but then expectations should be documented. I 
can understand that you want something like ifelse(y>x,x/y,z) to run without 
warning about division by zero, but how would that be implemented in general? 
Even a subexpression as simple as f(a,b) presents a problem: you want 
f(a,b)[cond], but you don't know how the function f works. It might be just a 
vector operation (and then perhaps f(a[cond],b[cond]) is what we want), or it 
might return a+rev(b). Avi Gross correctly notes that the implementation is not 
what he wants, but I think that what he wants is possible only in special cases.

Regards,
Jorgen Harmse. 



Message: 2
Date: Sat, 9 Oct 2021 15:35:55 -0400
From: "Avi Gross" 
To: 
Subject: [R] assumptions about how things are done
Message-ID: <029401d7bd44$e10843c0$a318cb40$@verizon.net>
Content-Type: text/plain; charset="utf-8"

This is supposed to be a forum for help so general and philosophical
discussions belong elsewhere, or nowhere.



Having said that, I want to make a brief point. Both new and experienced
people make implicit assumptions about the code they use. Often nobody looks
at how the sausage is made. The recent discussion of ifelse() made me take a
look and I was not thrilled.



My NAÏVE view was that ifelse() was implemented as a sort of loop construct.
I mean if I have a vector of length N and perhaps a few other vectors of the
same length, I might say:



result <- ifelse(condition-on-vector-A, result-if-true-using-vectors,
result-if-false-using-vectors)



So say I want to take a vector of integers from 1 to N and make an output a
second vector where you have either a prime number or NA. If I have a
function called is.prime() that checks a single number and returns
TRUE/FALSE, it might look like this:



primed <- ifelse(is.prime(A), A, NA)



So A[1] will be mapped to 1 and A[2] to 2 and A[3] to 3, but A[4] being
composite becomes NA and so on.



If you wrote the above using loops, it would be to range from index 1 to N
and apply the above. There are many complications as R allows vectors to be
longer or to be repeated as needed.



What I found ifelse() as implemented to do, is sort of like this:



Make a vector of the right length for the results, initially empty.



Make a vector evaluating the condition so it is effectively a Boolean
result.

  

Re: [R] How to add error bars to lattice xyplot

2021-10-11 Thread Bert Gunter
Your panel function needs to plot the points! See the line marked ### below

xyplot(Value ~ Concentration,
   group = Substance, data = df,
   pch = 16, cex = 1.2, type = "b",
   xlab=expression(bold(paste("Concentration (", mu, "M)"))),
   ylab=expression(bold("Infection rate")),
   col=COLS,
   scales = list(x = list(log = 10, at=c(unique(df$Concentration))
   )
   ),
   key = list(space="top", columns=4, col = "black",
  points=list(pch=c(16, 16, 16, 16),
  col=COLS
  ),
  text=list(c("A", "B", "C", "D")
  )
   ),
   panel = function (x,y,...) {
  panel.xyplot(x,y, ...)  ###
  panel.segments(x0 = log10(df$Concentration),
 x1 = log10(df$Concentration),
 y0 = df$Value - dfsd$Value,
 y1 = df$Value + dfsd$Value,
 col = COLS)
   }
)



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Oct 11, 2021 at 12:24 PM Luigi Marongiu 
wrote:

> Thanks,
> now I got the bars (although without notch) but I lost the main plot:
> ```
> xyplot(Value ~ Concentration,
> group = Substance, data = df,
> pch = 16, cex = 1.2, type = "b",
> xlab=expression(bold(paste("Concentration (", mu, "M)"))),
> ylab=expression(bold("Infection rate")),
> col=COLS,
> scales = list(x = list(log = 10, at=c(unique(df$Concentration))
> )
> ),
> key = list(space="top", columns=4, col = "black",
> points=list(pch=c(16, 16, 16, 16),
> col=COLS
> ),
> text=list(c("A", "B", "C", "D")
> )
> ),
> panel = function (x,y) {
> panel.segments(x0 = log10(df$Concentration),
> x1 = log10(df$Concentration),
> y0 = df$Value - dfsd$Value,
> y1 = df$Value + dfsd$Value,
> col = COLS)
> }
>
> )
> ```
> I will check xYplot out, I think it is the tool for the job.
>
> On Mon, Oct 11, 2021 at 3:56 PM Deepayan Sarkar
>  wrote:
> >
> > On Mon, Oct 11, 2021 at 5:41 PM Luigi Marongiu 
> wrote:
> > >
> > > Hello,
> > > I am trying to plot data using lattice. The basic plot works:
> > > ```
> > > Substance = rep(c("A", "B", "C", "D"),4)
> > > Concentration = rep(1:4,4),
> > > Value = c(62.8067, 116.2633,  92.2600,   9.8733,
> 14.8233,
> > >   92.3733, 98.9567,   1.4833,   0.6467,
> 50.6600,
> > >   25.7533,   0.6900, 0.2167,   7.4067,
>  6.9200,
> > >   0.0633)
> > > df = data.frame(Substance, Concentration, Value, stringsAsFactors =
> FALSE)
> > > Value = c(15.2974126, 16.3196089, 57.4294280,  9.1943370, 20.5567321,
> > > 14.0874424,
> > >38.3626672, 0.3780653,  0.4738495, 37.9124874, 16.2473916,
> 0.7218726,
> > >0.2498666,  8.4537585, 10.8058456,  0.1096966)
> > > dfsd = data.frame(Substance, Concentration, Value, stringsAsFactors =
> FALSE)
> > >
> > > library(lattice)
> > > COLS = c("gold", "forestgreen", "darkslategray3", "purple")
> > > xyplot(Value ~ Concentration,
> > >group = Substance, data = df,
> > >pch = 16, cex = 1.2, type = "b",
> > >xlab=expression(bold(paste("Concentration (", mu, "M)"))),
> > >ylab=expression(bold("Infection rate")),
> > >col=COLS,
> > >scales = list(x = list(log = 10, at=c(unique(df$Concentration))
> > >)
> > >),
> > >key = list(space="top", columns=4, col = "black",
> > >   points=list(pch=c(16, 16, 16, 16),
> > >   col=COLS,
> > >   text=list(c("6-PN", "8-PN", "IX", "XN")
> > >   )
> > >   )
> > >)
> > >
> > > )
> > > ```
> > > but how do I add the error bars?
> > > I tried with
> > > ```
> > > xyplot(Value ~ Concentration,
> > >group = Substance, data = df,
> > >pch = 16, cex = 1.2, type = "b",
> > >xlab=expression(bold(paste("Concentration (", mu, "M)"))),
> > >ylab=expression(bold("Infection rate")),
> > >col=COLS,
> > >scales = list(x = list(log = 10, at=c(unique(df$Concentration))
> > >)
> > >),
> > >key = list(space="top", columns=4, col = "black",
> > >   points=list(pch=c(16, 16, 16, 16),
> > >   col=COLS,
> > >   text=list(c("6-PN", "8-PN", "IX", "XN")
> > >   )
> > >   )
> > >),
> > >panel = function (x,y,) {
> > >  panel.segments(x0 = df$Concentration, x1 = df$Concentration,
> > > y0 = df$Value - dfsd$Value,
> > > y1 = df$Value + dfsd$Value,
> > > col = COLS)
> > >}
> > >
> > > )
> > > ```
> > > but the bars are plotted outside the graph.
> >
> > You need to a

Re: [R] unexpected behavior in apply

2021-10-11 Thread Rolf Turner
On Mon, 11 Oct 2021 09:15:27 +
PIKAL Petr  wrote:



> 
> data.frame is not matrix or array (even if it rather resembles one)
> 
> So if you put a cake into oven you cannot expect getting fried
> potatoes from it.



Another fortune nomination!

cheers,

Rolf

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to add error bars to lattice xyplot

2021-10-11 Thread Luigi Marongiu
Thanks,
now I got the bars (although without notch) but I lost the main plot:
```
xyplot(Value ~ Concentration,
group = Substance, data = df,
pch = 16, cex = 1.2, type = "b",
xlab=expression(bold(paste("Concentration (", mu, "M)"))),
ylab=expression(bold("Infection rate")),
col=COLS,
scales = list(x = list(log = 10, at=c(unique(df$Concentration))
)
),
key = list(space="top", columns=4, col = "black",
points=list(pch=c(16, 16, 16, 16),
col=COLS
),
text=list(c("A", "B", "C", "D")
)
),
panel = function (x,y) {
panel.segments(x0 = log10(df$Concentration),
x1 = log10(df$Concentration),
y0 = df$Value - dfsd$Value,
y1 = df$Value + dfsd$Value,
col = COLS)
}

)
```
I will check xYplot out, I think it is the tool for the job.

On Mon, Oct 11, 2021 at 3:56 PM Deepayan Sarkar
 wrote:
>
> On Mon, Oct 11, 2021 at 5:41 PM Luigi Marongiu  
> wrote:
> >
> > Hello,
> > I am trying to plot data using lattice. The basic plot works:
> > ```
> > Substance = rep(c("A", "B", "C", "D"),4)
> > Concentration = rep(1:4,4),
> > Value = c(62.8067, 116.2633,  92.2600,   9.8733,  
> > 14.8233,
> >   92.3733, 98.9567,   1.4833,   0.6467,  
> > 50.6600,
> >   25.7533,   0.6900, 0.2167,   7.4067,   6.9200,
> >   0.0633)
> > df = data.frame(Substance, Concentration, Value, stringsAsFactors = FALSE)
> > Value = c(15.2974126, 16.3196089, 57.4294280,  9.1943370, 20.5567321,
> > 14.0874424,
> >38.3626672, 0.3780653,  0.4738495, 37.9124874, 16.2473916,  
> > 0.7218726,
> >0.2498666,  8.4537585, 10.8058456,  0.1096966)
> > dfsd = data.frame(Substance, Concentration, Value, stringsAsFactors = FALSE)
> >
> > library(lattice)
> > COLS = c("gold", "forestgreen", "darkslategray3", "purple")
> > xyplot(Value ~ Concentration,
> >group = Substance, data = df,
> >pch = 16, cex = 1.2, type = "b",
> >xlab=expression(bold(paste("Concentration (", mu, "M)"))),
> >ylab=expression(bold("Infection rate")),
> >col=COLS,
> >scales = list(x = list(log = 10, at=c(unique(df$Concentration))
> >)
> >),
> >key = list(space="top", columns=4, col = "black",
> >   points=list(pch=c(16, 16, 16, 16),
> >   col=COLS,
> >   text=list(c("6-PN", "8-PN", "IX", "XN")
> >   )
> >   )
> >)
> >
> > )
> > ```
> > but how do I add the error bars?
> > I tried with
> > ```
> > xyplot(Value ~ Concentration,
> >group = Substance, data = df,
> >pch = 16, cex = 1.2, type = "b",
> >xlab=expression(bold(paste("Concentration (", mu, "M)"))),
> >ylab=expression(bold("Infection rate")),
> >col=COLS,
> >scales = list(x = list(log = 10, at=c(unique(df$Concentration))
> >)
> >),
> >key = list(space="top", columns=4, col = "black",
> >   points=list(pch=c(16, 16, 16, 16),
> >   col=COLS,
> >   text=list(c("6-PN", "8-PN", "IX", "XN")
> >   )
> >   )
> >),
> >panel = function (x,y,) {
> >  panel.segments(x0 = df$Concentration, x1 = df$Concentration,
> > y0 = df$Value - dfsd$Value,
> > y1 = df$Value + dfsd$Value,
> > col = COLS)
> >}
> >
> > )
> > ```
> > but the bars are plotted outside the graph.
>
> You need to apply the log-transformation yourself, e.g.,
>
>  panel.segments(x0 = log10(df$Concentration), x1 =
> log10(df$Concentration),
>
> But this is not really a scalable approach. You should check if
> Hmisc::xYplot suits your needs:
>
> https://search.r-project.org/CRAN/refmans/Hmisc/html/xYplot.html
>
> Best,
> -Deepayan
>
> > What is the correct syntax? can I use raw data instead of making the
> > mean and std dev separately?
> > Thanks
> >
> > --
> > Best regards,
> > Luigi
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.



-- 
Best regards,
Luigi



Re: [R] [External] Missing text in lattice key legend

2021-10-11 Thread Luigi Marongiu
Yes, that was the case. Now it works! Thanks

On Mon, Oct 11, 2021 at 4:22 PM Richard M. Heiberger  wrote:
>
> Looks like a paren out of place. The text is inside the points; it should be
> parallel to the points in the calling sequence.
>
> Get Outlook for iOS
> 
> From: R-help  on behalf of Luigi Marongiu 
> 
> Sent: Monday, October 11, 2021 7:46:36 AM
> To: r-help 
> Subject: [External] [R] Missing text in lattice key legend
>
> Hello,
> I am drawing some data with lattice using:
> ```
> library(lattice)
> COLS = c("gold", "forestgreen", "darkslategray3", "purple")
> xyplot(Value ~ Concentration,
>group = Substance, data = inf_avg,
>pch = 16, cex = 1.2, type = "b",
>xlab=expression(bold(paste("Concentration (", mu, "M)"))),
>ylab=expression(bold("Infection rate")),
>col=COLS,
>scales = list(x = list(log = 10, at=c(unique(inf_avg$Concentration))
>   )
>  ),
>key = list(space="top", columns=4, col = "black",
>points=list(pch=c(16, 16, 16, 16),
>col=COLS,
>text=list(c("6-PN", "8-PN", "IX", "XN")
> )
>)
>   ),
>panel = function(x,y) {
>  panel.xyplot(x,y)
>  errbar()
>}
> )
> ```
> It all works but the legend only shows the colored dots, there is no
> text. Is it something missing from the syntax?
> Thanks
>
> --
> Best regards,
> Luigi
>



-- 
Best regards,
Luigi



Re: [R] How to read from a file within a gzipped file

2021-10-11 Thread Rui Barradas

Hello,

You can create a connection and read from it.

desc <- "X:/Mkt Science/Projects/ tv/202109.ext/tsa.20210901.xml.zip"
fname <- "20210901.socioDemos.a.xml"
zz <- unz(desc, fname)


Now read from zz. Example:


xml <- XML::xmlParse(zz)


Hope this helps,

Rui Barradas

At 17:24 on 11/10/21, Conklin, Mike (GfK) via R-help wrote:


Hi, I have a large number of zipped files, each containing 3 xml files that I
want to read.  I would like to read one of the xml files without having to
decompress each zip file first.

If I run gzfile(path2zipped file) I get

A connection with
description "X:/Mkt Science/Projects/ tv/202109.ext/tsa.20210901.xml.zip"
class       "gzfile"
mode        "rb"
text        "text"
opened      "closed"
can read    "yes"
can write   "yes"

Within that zipped file I want to read from

"20210901.socioDemos.a.xml" which is one of 3 xml files within the zip file



Is there a way to return a connection to a single file within a zipped file 
using gzfile or some other method.



--
W. Michael Conklin
Executive Vice President
Marketing & Data Sciences - North America
GfK
mike.conk...@gfk.com
M +1 612 567 8287
www.gfk.com




Re: [R] How to read from a file within a gzipped file

2021-10-11 Thread Ivan Krylov
On Mon, 11 Oct 2021 16:24:01 +
"Conklin, Mike (GfK) via R-help"  wrote:

> Is there a way to return a connection to a single file within a
> zipped file using gzfile or some other method.

Sure! Use unz() instead of gzfile().
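
A minimal sketch of how unz() can be used, assuming the archive is a regular zip file. The helper name `read_zip_member` and the demo file names are hypothetical, and the demo archive is only built if a `zip` program is available on the system.

```r
# Read one member of a zip archive as text without extracting the archive to disk.
read_zip_member <- function(zip_path, member) {
  con <- unz(zip_path, member, open = "r")  # connection to a single member
  on.exit(close(con))
  readLines(con, warn = FALSE)
}

# Self-contained demo: build a tiny archive in a temporary directory.
demo_dir <- tempfile("zipdemo")
dir.create(demo_dir)
xml_file <- file.path(demo_dir, "a.xml")
writeLines(c("<root>", "  <item>1</item>", "</root>"), xml_file)
zip_path <- file.path(demo_dir, "demo.zip")

have_zip <- nzchar(Sys.which("zip"))   # utils::zip() relies on an external program
if (have_zip) {
  utils::zip(zip_path, files = xml_file, flags = "-jq")  # -j drops directory paths
  members <- utils::unzip(zip_path, list = TRUE)$Name    # list members, no extraction
  xml_lines <- read_zip_member(zip_path, "a.xml")
}
```

Listing the members with `utils::unzip(..., list = TRUE)` first is a convenient way to confirm the exact internal file name before opening the connection. The lines returned can then be handed to whichever XML parser is in use.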

-- 
Best regards,
Ivan



[R] How to read from a file within a gzipped file

2021-10-11 Thread Conklin, Mike (GfK) via R-help


Hi, I have a large number of zipped files, each containing 3 xml files that I
want to read.  I would like to read one of the xml files without having to
decompress each zip file first.

If I run gzfile(path2zipped file) I get

A connection with
description "X:/Mkt Science/Projects/ tv/202109.ext/tsa.20210901.xml.zip"
class       "gzfile"
mode        "rb"
text        "text"
opened      "closed"
can read    "yes"
can write   "yes"

Within that zipped file I want to read from

"20210901.socioDemos.a.xml" which is one of 3 xml files within the zip file



Is there a way to return a connection to a single file within a zipped file 
using gzfile or some other method.



--
W. Michael Conklin
Executive Vice President
Marketing & Data Sciences - North America
GfK
mike.conk...@gfk.com
M +1 612 567 8287
www.gfk.com




Re: [R] assumptions about how things are done

2021-10-11 Thread Jorgen Harmse via R-help
As noted by Richard O'Keefe, what usually happens in an R function is that any 
argument is evaluated either in its entirety or not at all. A few functions use 
substitute or similar trickery, but then expectations should be documented. I 
can understand that you want something like ifelse(y>x,x/y,z) to run without 
warning about division by zero, but how would that be implemented in general? 
Even a subexpression as simple as f(a,b) presents a problem: you want 
f(a,b)[cond], but you don't know how the function f works. It might be just a 
vector operation (and then perhaps f(a[cond],b[cond]) is what we want), or it 
might return a+rev(b). Avi Gross correctly notes that the implementation is not 
what he wants, but I think that what he wants is possible only in special cases.
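
The point about f(a,b) can be illustrated with a small sketch. The two functions and the data below are made up for illustration; only the a+rev(b) form comes from the message above.

```r
f_elementwise <- function(a, b) a + b        # result i depends only on a[i], b[i]
f_positional  <- function(a, b) a + rev(b)   # result i depends on a "distant" element of b

a    <- c(1, 2, 3, 4)
b    <- c(10, 20, 30, 40)
cond <- c(TRUE, FALSE, TRUE, FALSE)

# For an element-wise function, subsetting before or after gives the same answer.
same_elementwise <- identical(f_elementwise(a, b)[cond],
                              f_elementwise(a[cond], b[cond]))

# For a+rev(b), the two orders disagree.
after_subset  <- f_positional(a, b)[cond]        # 41 23
before_subset <- f_positional(a[cond], b[cond])  # 31 13
```

Since ifelse() cannot know which kind of function it has been handed, evaluating the whole argument and subsetting afterwards is the only choice that is correct in general.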

Regards,
Jorgen Harmse. 



Message: 2
Date: Sat, 9 Oct 2021 15:35:55 -0400
From: "Avi Gross" 
To: 
Subject: [R] assumptions about how things are done
Message-ID: <029401d7bd44$e10843c0$a318cb40$@verizon.net>
Content-Type: text/plain; charset="utf-8"

This is supposed to be a forum for help so general and philosophical
discussions belong elsewhere, or nowhere.



Having said that, I want to make a brief point. Both new and experienced
people make implicit assumptions about the code they use. Often nobody looks
at how the sausage is made. The recent discussion of ifelse() made me take a
look and I was not thrilled.



My NAÏVE view was that ifelse() was implemented as a sort of loop construct.
I mean if I have a vector of length N and perhaps a few other vectors of the
same length, I might say:



result <- ifelse(condition-on-vector-A, result-if-true-using-vectors,
result-if-false-using-vectors)



So say I want to take a vector of integers from 1 to N and make an output a
second vector where you have either a prime number or NA. If I have a
function called is.prime() that checks a single number and returns
TRUE/FALSE, it might look like this:



primed <- ifelse(is.prime(A), A, NA)



So A[1] will be mapped to 1 and A[2] to 2 and A[3] to 3, but A[4] being
composite becomes NA and so on.



If you wrote the above using loops, it would be to range from index 1 to N
and apply the above. There are many complications as R allows vectors to be
longer or to be repeated as needed.



What I found ifelse() as implemented to do, is sort of like this:



Make a vector of the right length for the results, initially empty.



Make a vector evaluating the condition so it is effectively a Boolean
result.

Calculate which indices are TRUE. Secondarily, calculate another set of
indices that are false.



Calculate ALL the THEN conditions and ditto all the ELSE conditions.



Now copy into the result all the THEN values indexed by the TRUE above and
then all the ELSE values indicated by the FALSE above.



In plain English, make a result from two other results based on picking
either one from menu A or one from menu B.
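
The steps described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual source of base R's ifelse(): the name `simple_ifelse` is hypothetical, and attribute handling and some edge cases are left out.

```r
simple_ifelse <- function(test, yes, no) {
  test <- as.logical(test)             # the condition as a logical vector
  n <- length(test)
  result <- rep(NA, n)                 # result vector of the right length

  # Both alternatives are already fully evaluated here and recycled to length n.
  yes <- rep(yes, length.out = n)
  no  <- rep(no,  length.out = n)

  true_idx  <- which(test)             # indices where the condition is TRUE
  false_idx <- which(!test)            # indices where it is FALSE (NA excluded)

  result[true_idx]  <- yes[true_idx]   # pick from "menu A"
  result[false_idx] <- no[false_idx]   # pick from "menu B"
  result
}

picked <- simple_ifelse(c(TRUE, FALSE, NA), 1:3, 0)
```

Entries where the condition is NA are never assigned and stay NA, which matches the documented behaviour of ifelse() for missing test values.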



That is not a bad algorithm and in a vectorized language like R, maybe even
quite effective and efficient. It does lots of extra work as by definition
it throws at least half away.



I suspect the implementation could be made much faster by making some of it
done internally using a language like C.



But now that I know what this implementation did, I might have some qualms
at using it in some situations. The original complaint led to other
observations and needs and perhaps blindly using a supplied function like
ifelse() may not be a decent solution for some needs.



I note how I had to reorient my work elsewhere using a group of packages
called the tidyverse when they added a function to allow rowwise
manipulation of the data as compared to an ifelse-like method using all
columns at once. There is room for many approaches and if a function may not
be doing quite what you want, something else may better meet your needs OR
you may want to see if you can copy the existing function and modify it for
your own personal needs.



In the case we mentioned, the goal was to avoid printing selected warnings.
Since the function is readable, it can easily be modified in a copy to find
what is causing the warnings and either rewrite a bit to avoid them or start
over with perhaps your own function that tests before doing things and
avoids tripping the condition (generating a NaN) entirely.



Like many languages, R is a bit too rich. You can piggyback on the work of
others but with some caution as they did not necessarily have you in mind
with what they created.






[[alternative HTML version deleted]]





--

Message: 4
Date: Sun, 10 Oct 2021 08:34:52 +1100
From: Jim Lemon 
To: Avi Gross 
Cc: r-help mailing list 
Subject: Re: [R

Re: [R] [External] Missing text in lattice key legend

2021-10-11 Thread Richard M. Heiberger
Looks like a paren out of place. The text is inside the points; it should be
parallel to the points in the calling sequence.
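
A sketch of the key with the parenthesis moved, so that `text` is a sibling of `points` rather than nested inside it. The colours and labels are copied from the question below; the variable name `key_fixed` is only for illustration.

```r
COLS <- c("gold", "forestgreen", "darkslategray3", "purple")

key_fixed <- list(
  space = "top", columns = 4, col = "black",
  points = list(pch = c(16, 16, 16, 16), col = COLS),  # points list closed here
  text   = list(c("6-PN", "8-PN", "IX", "XN"))         # text at the same level
)
```

Passing `key = key_fixed` to xyplot() should then give one label per coloured dot.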


From: R-help  on behalf of Luigi Marongiu 

Sent: Monday, October 11, 2021 7:46:36 AM
To: r-help 
Subject: [External] [R] Missing text in lattice key legend

Hello,
I am drawing some data with lattice using:
```
library(lattice)
COLS = c("gold", "forestgreen", "darkslategray3", "purple")
xyplot(Value ~ Concentration,
   group = Substance, data = inf_avg,
   pch = 16, cex = 1.2, type = "b",
   xlab=expression(bold(paste("Concentration (", mu, "M)"))),
   ylab=expression(bold("Infection rate")),
   col=COLS,
   scales = list(x = list(log = 10, at=c(unique(inf_avg$Concentration))
  )
 ),
   key = list(space="top", columns=4, col = "black",
   points=list(pch=c(16, 16, 16, 16),
   col=COLS,
   text=list(c("6-PN", "8-PN", "IX", "XN")
)
   )
  ),
   panel = function(x,y) {
 panel.xyplot(x,y)
 errbar()
   }
)
```
It all works but the legend only shows the colored dots, there is no
text. Is it something missing from the syntax?
Thanks

--
Best regards,
Luigi



Re: [R] How to add error bars to lattice xyplot

2021-10-11 Thread Deepayan Sarkar
On Mon, Oct 11, 2021 at 5:41 PM Luigi Marongiu  wrote:
>
> Hello,
> I am trying to plot data using lattice. The basic plot works:
> ```
> Substance = rep(c("A", "B", "C", "D"),4)
> Concentration = rep(1:4,4)
> Value = c(62.8067, 116.2633,  92.2600,   9.8733,  14.8233,
>   92.3733, 98.9567,   1.4833,   0.6467,  50.6600,
>   25.7533,   0.6900, 0.2167,   7.4067,   6.9200,
>   0.0633)
> df = data.frame(Substance, Concentration, Value, stringsAsFactors = FALSE)
> Value = c(15.2974126, 16.3196089, 57.4294280,  9.1943370, 20.5567321,
> 14.0874424,
>38.3626672, 0.3780653,  0.4738495, 37.9124874, 16.2473916,  0.7218726,
>0.2498666,  8.4537585, 10.8058456,  0.1096966)
> dfsd = data.frame(Substance, Concentration, Value, stringsAsFactors = FALSE)
>
> library(lattice)
> COLS = c("gold", "forestgreen", "darkslategray3", "purple")
> xyplot(Value ~ Concentration,
>group = Substance, data = df,
>pch = 16, cex = 1.2, type = "b",
>xlab=expression(bold(paste("Concentration (", mu, "M)"))),
>ylab=expression(bold("Infection rate")),
>col=COLS,
>scales = list(x = list(log = 10, at=c(unique(df$Concentration))
>)
>),
>key = list(space="top", columns=4, col = "black",
>   points=list(pch=c(16, 16, 16, 16),
>   col=COLS,
>   text=list(c("6-PN", "8-PN", "IX", "XN")
>   )
>   )
>)
>
> )
> ```
> but how do I add the error bars?
> I tried with
> ```
> xyplot(Value ~ Concentration,
>group = Substance, data = df,
>pch = 16, cex = 1.2, type = "b",
>xlab=expression(bold(paste("Concentration (", mu, "M)"))),
>ylab=expression(bold("Infection rate")),
>col=COLS,
>scales = list(x = list(log = 10, at=c(unique(df$Concentration))
>)
>),
>key = list(space="top", columns=4, col = "black",
>   points=list(pch=c(16, 16, 16, 16),
>   col=COLS,
>   text=list(c("6-PN", "8-PN", "IX", "XN")
>   )
>   )
>),
>panel = function (x,y,...) {
>  panel.segments(x0 = df$Concentration, x1 = df$Concentration,
> y0 = df$Value - dfsd$Value,
> y1 = df$Value + dfsd$Value,
> col = COLS)
>}
>
> )
> ```
> but the bars are plotted outside the graph.

You need to apply the log-transformation yourself, e.g.,

 panel.segments(x0 = log10(df$Concentration),
                x1 = log10(df$Concentration),

But this is not really a scalable approach. You should check if
Hmisc::xYplot suits your needs:

https://search.r-project.org/CRAN/refmans/Hmisc/html/xYplot.html
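
For what it is worth, here is a minimal sketch of a more self-contained
variant that stays within lattice. The standard deviations are handed to
xyplot() as an extra argument (called sdev here, a made-up name) and picked
up in the panel function through subscripts, so nothing is looked up in
global data frames. The data frame dat and its values are invented for
illustration, and the y-limits are widened by hand because the default
limits are computed from the means only.

```r
library(lattice)

# Hypothetical summary data: one mean and one SD per substance and concentration
dat <- data.frame(
  Substance     = rep(c("A", "B"), each = 3),
  Concentration = rep(c(1, 10, 100), times = 2),
  Mean          = c(60, 15, 1, 90, 50, 7),
  SD            = c(15, 5, 0.5, 16, 12, 3)
)
COLS <- c("gold", "purple")

p <- xyplot(Mean ~ Concentration, groups = Substance, data = dat,
            sdev = dat$SD,                       # extra argument, forwarded to the panel function
            type = "b", pch = 16, cex = 1.2, col = COLS,
            scales = list(x = list(log = 10)),
            # make room for the bars; default limits ignore the SDs
            ylim = extendrange(c(dat$Mean - dat$SD, dat$Mean + dat$SD)),
            panel = function(x, y, groups, subscripts, sdev, col, ...) {
              s <- sdev[subscripts]                           # SDs of the rows in this panel
              g <- as.integer(as.factor(groups))[subscripts]  # group index of each row
              # x is already on the log10 scale here, so it is used as is
              panel.segments(x0 = x, y0 = y - s, x1 = x, y1 = y + s, col = col[g])
              # draw the grouped points and lines on top of the bars
              panel.superpose(x, y, groups = groups, subscripts = subscripts,
                              col = col, ...)
            })
# print(p) draws the plot
```

Because scales asks for a log10 x-axis, x reaches the panel function already
transformed, so no explicit log10() is needed there. If capped bars are
wanted, replacing panel.segments() with panel.arrows(..., angle = 90,
code = 3) is one option to try.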

Best,
-Deepayan

> What is the correct syntax? can I use raw data instead of making the
> mean and std dev separately?
> Thanks
>
> --
> Best regards,
> Luigi


Re: [R] Missing text in lattice key legend

2021-10-11 Thread Luigi Marongiu
Awesome, thanks!

On Mon, Oct 11, 2021 at 2:19 PM Deepayan Sarkar wrote:
>
> On Mon, Oct 11, 2021 at 5:17 PM Luigi Marongiu wrote:
> >
> > Hello,
> > I am drawing some data with lattice using:
> > ```
> > library(lattice)
> > COLS = c("gold", "forestgreen", "darkslategray3", "purple")
> > xyplot(Value ~ Concentration,
> >group = Substance, data = inf_avg,
> >pch = 16, cex = 1.2, type = "b",
> >xlab=expression(bold(paste("Concentration (", mu, "M)"))),
> >ylab=expression(bold("Infection rate")),
> >col=COLS,
> >scales = list(x = list(log = 10, at=c(unique(inf_avg$Concentration))
> >   )
> >  ),
> >key = list(space="top", columns=4, col = "black",
> >points=list(pch=c(16, 16, 16, 16),
> >col=COLS,
> >text=list(c("6-PN", "8-PN", "IX", "XN")
> > )
> >)
> >   ),
> >panel = function(x,y) {
> >  panel.xyplot(x,y)
> >  errbar()
> >}
> > )
> > ```
> > It all works but the legend only shows the colored dots, there is no
> > text. Is it something missing from the syntax?
>
> Your text component is nested inside the points component. I think you
> want it outside, e.g.,
>
> xyplot(1 ~ 1,
>key = list(space="top", columns=4, col = "black",
>   points=list(pch=c(16, 16, 16, 16),
>   col=COLS),
>   text=list(c("6-PN", "8-PN", "IX", "XN"))
>   ))
>
> Best,
> -Deepayan
>
> > Thanks
> >
> > --
> > Best regards,
> > Luigi



-- 
Best regards,
Luigi



Re: [R] Missing text in lattice key legend

2021-10-11 Thread Deepayan Sarkar
On Mon, Oct 11, 2021 at 5:17 PM Luigi Marongiu  wrote:
>
> Hello,
> I am drawing some data with lattice using:
> ```
> library(lattice)
> COLS = c("gold", "forestgreen", "darkslategray3", "purple")
> xyplot(Value ~ Concentration,
>group = Substance, data = inf_avg,
>pch = 16, cex = 1.2, type = "b",
>xlab=expression(bold(paste("Concentration (", mu, "M)"))),
>ylab=expression(bold("Infection rate")),
>col=COLS,
>scales = list(x = list(log = 10, at=c(unique(inf_avg$Concentration))
>   )
>  ),
>key = list(space="top", columns=4, col = "black",
>points=list(pch=c(16, 16, 16, 16),
>col=COLS,
>text=list(c("6-PN", "8-PN", "IX", "XN")
> )
>)
>   ),
>panel = function(x,y) {
>  panel.xyplot(x,y)
>  errbar()
>}
> )
> ```
> It all works but the legend only shows the colored dots, there is no
> text. Is it something missing from the syntax?

Your text component is nested inside the points component. I think you
want it outside, e.g.,

xyplot(1 ~ 1,
   key = list(space="top", columns=4, col = "black",
              points=list(pch=c(16, 16, 16, 16),
                          col=COLS),
              text=list(c("6-PN", "8-PN", "IX", "XN"))
              ))
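
A quick way to see the difference without drawing anything, as a sketch
(LABELS, key_nested and key_fixed are names made up here): build both key
lists and look at their top-level components. In the nested version the key
has no text component of its own, so there is nothing for the legend to
print next to the points.

```r
library(lattice)

COLS   <- c("gold", "forestgreen", "darkslategray3", "purple")
LABELS <- c("6-PN", "8-PN", "IX", "XN")

# As posted: text= sits inside the points= list
key_nested <- list(space = "top", columns = 4, col = "black",
                   points = list(pch = rep(16, 4), col = COLS,
                                 text = list(LABELS)))

# Corrected: points= and text= are siblings inside key=
key_fixed <- list(space = "top", columns = 4, col = "black",
                  points = list(pch = rep(16, 4), col = COLS),
                  text = list(LABELS))

"text" %in% names(key_nested)   # no top-level text component
"text" %in% names(key_fixed)    # text component present

# The corrected key can be handed to xyplot() unchanged
p <- xyplot(1 ~ 1, key = key_fixed)
```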

Best,
-Deepayan

> Thanks
>
> --
> Best regards,
> Luigi


[R] How to add error bars to lattice xyplot

2021-10-11 Thread Luigi Marongiu
Hello,
I am trying to plot data using lattice. The basic plot works:
```
Substance = rep(c("A", "B", "C", "D"),4)
Concentration = rep(1:4,4)
Value = c(62.8067, 116.2633,  92.2600,   9.8733,  14.8233,
  92.3733, 98.9567,   1.4833,   0.6467,  50.6600,
  25.7533,   0.6900, 0.2167,   7.4067,   6.9200,
  0.0633)
df = data.frame(Substance, Concentration, Value, stringsAsFactors = FALSE)
Value = c(15.2974126, 16.3196089, 57.4294280,  9.1943370, 20.5567321,
14.0874424,
   38.3626672, 0.3780653,  0.4738495, 37.9124874, 16.2473916,  0.7218726,
   0.2498666,  8.4537585, 10.8058456,  0.1096966)
dfsd = data.frame(Substance, Concentration, Value, stringsAsFactors = FALSE)

library(lattice)
COLS = c("gold", "forestgreen", "darkslategray3", "purple")
xyplot(Value ~ Concentration,
   group = Substance, data = df,
   pch = 16, cex = 1.2, type = "b",
   xlab=expression(bold(paste("Concentration (", mu, "M)"))),
   ylab=expression(bold("Infection rate")),
   col=COLS,
   scales = list(x = list(log = 10, at=c(unique(df$Concentration))
   )
   ),
   key = list(space="top", columns=4, col = "black",
  points=list(pch=c(16, 16, 16, 16),
  col=COLS,
  text=list(c("6-PN", "8-PN", "IX", "XN")
  )
  )
   )

)
```
but how do I add the error bars?
I tried with
```
xyplot(Value ~ Concentration,
   group = Substance, data = df,
   pch = 16, cex = 1.2, type = "b",
   xlab=expression(bold(paste("Concentration (", mu, "M)"))),
   ylab=expression(bold("Infection rate")),
   col=COLS,
   scales = list(x = list(log = 10, at=c(unique(df$Concentration))
   )
   ),
   key = list(space="top", columns=4, col = "black",
  points=list(pch=c(16, 16, 16, 16),
  col=COLS,
  text=list(c("6-PN", "8-PN", "IX", "XN")
  )
  )
   ),
   panel = function (x, y, ...) {
     panel.segments(x0 = df$Concentration, x1 = df$Concentration,
                    y0 = df$Value - dfsd$Value,
                    y1 = df$Value + dfsd$Value,
                    col = COLS)
   }

)
```
but the bars are plotted outside the graph.
What is the correct syntax? Can I use the raw data instead of computing the
mean and standard deviation separately?
Thanks

-- 
Best regards,
Luigi



[R] Missing text in lattice key legend

2021-10-11 Thread Luigi Marongiu
Hello,
I am drawing some data with lattice using:
```
library(lattice)
COLS = c("gold", "forestgreen", "darkslategray3", "purple")
xyplot(Value ~ Concentration,
   group = Substance, data = inf_avg,
   pch = 16, cex = 1.2, type = "b",
   xlab=expression(bold(paste("Concentration (", mu, "M)"))),
   ylab=expression(bold("Infection rate")),
   col=COLS,
   scales = list(x = list(log = 10, at=c(unique(inf_avg$Concentration)))),
   key = list(space="top", columns=4, col = "black",
              points=list(pch=c(16, 16, 16, 16),
                          col=COLS,
                          text=list(c("6-PN", "8-PN", "IX", "XN"))
              )
   ),
   panel = function(x,y) {
     panel.xyplot(x,y)
     errbar()
   }
)
```
It all works, but the legend only shows the colored dots; there is no
text. Is something missing from the syntax?
Thanks

-- 
Best regards,
Luigi



Re: [R] unexpected behavior in apply

2021-10-11 Thread PIKAL Petr
Hi

it is not surprising at all.

From the apply documentation:

Arguments
X
an array, including a matrix.

A data.frame is not a matrix or an array (even if it rather resembles one).

So if you put a cake into the oven, you cannot expect to get fried potatoes
out of it.

For data frames, sapply or lapply is preferable, as they are designed for
lists, and a data frame is (again from the documentation):

A data frame is a list of variables of the same number of rows with unique
row names, given class "data.frame".

> sapply(d,function(x) all(x[!is.na(x)]<=3))
   d1    d2    d3 
FALSE  TRUE FALSE 
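
To make the coercion visible, here is a small sketch using the data frame
from the original post (m, num and res are names made up here). as.matrix()
shows what apply() actually iterates over, and vapply() on the numeric
columns is one way to avoid the character comparison altogether. Note that
comparing a string with a number converts the number to a string, and the
result of a string comparison depends on the collation rules of the current
locale, so the TRUE reported for d3 in the original post may not reproduce
on every system.

```r
d <- data.frame(d1 = letters[1:3],
                d2 = c(1, 2, 3),
                d3 = c(NA, NA, 6))

# apply() first turns the data frame into a matrix; with a character
# column present, the whole matrix becomes character
m <- as.matrix(d)
is.character(m)
m[3, "d3"]        # shown as " 6" (with a leading space) in the thread's session

# String-vs-number comparison is done on strings, using the locale's collation
" 6" <= 3
"6" <= 3

# Restrict the check to numeric columns and work column by column
num <- vapply(d, is.numeric, logical(1))
res <- vapply(d[num], function(x) all(x[!is.na(x)] <= 3), logical(1))
res
```

Unlike sapply, vapply states the expected return type, so an unexpected
result is reported as an error instead of silently changing shape.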

Cheers
Petr


> -Original Message-
> From: R-help  On Behalf Of Jiefei Wang
> Sent: Friday, October 8, 2021 8:22 PM
> To: Derickson, Ryan, VHA NCOD 
> Cc: r-help@r-project.org
> Subject: Re: [R] unexpected behavior in apply
> 
> Ok, it turns out that this is documented, even though it looks surprising.
> 
> First of all, the apply function will try to convert any object with the dim
> attribute to a matrix (my intuition agrees with you that there should be no
> conversion), so the first step of the apply function is
> 
> > as.matrix.data.frame(d)
>  d1  d2  d3
> [1,] "a" "1" NA
> [2,] "b" "2" NA
> [3,] "c" "3" " 6"
> 
> Since the data frame `d` is a mixture of character and non-character values,
> the non-character values will be converted to character using the function
> `format`. However, the problem is that the NA value will also be formatted to
> character
> 
> > format(c(NA, 6))
> [1] "NA" " 6"
> 
> That's where the space comes from. It is purely for making the result pretty...
> The character NA will be removed later, but the space is not stripped. I would
> say this is not a good design, and it might be worth not including the NA value
> in the format function. At the current stage, I will suggest using the function
> `lapply` to do what you want.
> 
> > lapply(d, FUN=function(x)all(x[!is.na(x)] <= 3))
> $d1
> [1] FALSE
> $d2
> [1] TRUE
> $d3
> [1] FALSE
> 
> Everything should work as you expect.
> 
> Best,
> Jiefei
> 
> On Sat, Oct 9, 2021 at 2:03 AM Jiefei Wang  wrote:
> >
> > Hi,
> >
> > I guess this can tell you what happens behind the scenes
> >
> >
> > > d<-data.frame(d1 = letters[1:3],
> > +   d2 = c(1,2,3),
> > +   d3 = c(NA,NA,6))
> > > apply(d, 2, FUN=function(x)x)
> >  d1  d2  d3
> > [1,] "a" "1" NA
> > [2,] "b" "2" NA
> > [3,] "c" "3" " 6"
> > > "a"<=3
> > [1] FALSE
> > > "2"<=3
> > [1] TRUE
> > > "6"<=3
> > [1] FALSE
> >
> > Note that there is an additional space in the character value " 6";
> > that's why your comparison fails. I do not understand why, but this
> > might be a bug in R
> >
> > Best,
> > Jiefei
> >
> > On Sat, Oct 9, 2021 at 1:49 AM Derickson, Ryan, VHA NCOD via R-help
> >  wrote:
> > >
> > > Hello,
> > >
> > > I'm seeing unexpected behavior when using apply() compared to a for
> > > loop when a character vector is part of the data subjected to the apply
> > > statement. Below, I check whether all non-missing values are <= 3. If I
> > > include a character column, apply incorrectly returns TRUE for d3. If I
> > > only pass the numeric columns to apply, it is correct for d3. If I use a
> > > for loop, it is correct.
> > >
> > > > d<-data.frame(d1 = letters[1:3],
> > > +   d2 = c(1,2,3),
> > > +   d3 = c(NA,NA,6))
> > > >
> > > > d
> > >   d1 d2 d3
> > > 1  a  1 NA
> > > 2  b  2 NA
> > > 3  c  3  6
> > > >
> > > > # results are incorrect
> > > > apply(d, 2, FUN=function(x)all(x[!is.na(x)] <= 3))
> > >    d1    d2    d3
> > > FALSE  TRUE  TRUE
> > > >
> > > > # results are correct
> > > > apply(d[,2:3], 2, FUN=function(x)all(x[!is.na(x)] <= 3))
> > >    d2    d3
> > >  TRUE FALSE
> > > >
> > > > # results are correct
> > > > for(i in names(d)){
> > > +   print(all(d[!is.na(d[,i]),i] <= 3)) }
> > > [1] FALSE
> > > [1] TRUE
> > > [1] FALSE
> > >
> > >
> > > Finally, if I remove the NA values from d3 and include the character
> > > column in apply, it is correct.
> > >
> > > > d<-data.frame(d1 = letters[1:3],
> > > +   d2 = c(1,2,3),
> > > +   d3 = c(4,5,6))
> > > >
> > > > d
> > >   d1 d2 d3
> > > 1  a  1  4
> > > 2  b  2  5
> > > 3  c  3  6
> > > >
> > > > # results are correct
> > > > apply(d, 2, FUN=function(x)all(x[!is.na(x)] <= 3))
> > >    d1    d2    d3
> > > FALSE  TRUE FALSE
> > >
> > >
> > > Can someone help me understand what's happening?
> > >
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and pro