Re: [R] prop.trend.test

2023-09-07 Thread Thomas Subia via R-help
Colleagues,

Thanks all for the responses. 

I am monitoring the daily total number of defects per sample unit. 
I need to know whether this daily defect proportion is trending upward (a bad 
thing for a manufacturing process).

My first thought was to use either a u or a u' control chart for this.
As far as I know, u or u' charts are poor at detecting drifts.

This is why I chose to use prop.trend.test to detect trends in proportions.

While prop.trend.test can confirm the existence of a trend, as far as I know,
it is left to the user to determine the direction of that trend.

One way to illustrate trending is, of course, to plot the data and use
geom_smooth with method = "lm".
For the non-statisticians in my group, I've found that this plot, along with
the p-value of prop.trend.test, makes it easier to determine the existence of
a trend and its direction.
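
For concreteness, here is a minimal sketch of that approach (the defect and
unit counts below are made up for illustration, not my process data):

library(ggplot2)

# Hypothetical daily counts: defects observed out of units inspected
defects <- c(4, 6, 5, 8, 9, 11, 10, 13)
units   <- c(200, 210, 205, 198, 202, 207, 199, 203)

# Trend test on the daily proportions (scores default to 1, 2, ..., k)
prop.trend.test(defects, units)

# Plot the proportions with a linear smoother to show the direction
dat <- data.frame(day = seq_along(defects), p = defects / units)
ggplot(dat, aes(x = day, y = p)) +
  geom_point() +
  geom_smooth(method = "lm", se = TRUE)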

If there are any other ways to do this, please let me know.

Thomas Subia

On Thursday, September 7, 2023 at 10:31:27 AM PDT, Rui Barradas 
 wrote: 





At 14:23 on 07/09/2023, Thomas Subia via R-help wrote:
> 
> Colleagues
> 
>   Consider
> smokers  <- c( 83, 90, 129, 70 )
> patients <- c( 86, 93, 136, 82 )
> 
>   prop.trend.test(smokers, patients)
> 
>   Output:
> 
>   Chi-squared Test for Trend in Proportions
> 
>   data:  smokers out of patients ,
> 
> using scores: 1 2 3 4
> 
> X-squared = 8.2249, df = 1, p-value = 0.004132
> 
>   # trend test for proportions indicates proportions are trending.
> 
>   How does one identify the direction of trending?
>   # prop.test indicates that the proportions are unequal but does little to
> indicate trend direction.
> All the best,
> Thomas Subia
> 
> 
>     [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Hello,

By visual inspection it seems that there is a decreasing trend.
Note that the sample estimates of prop.test and smokers/patients are equal.


smokers  <- c( 83, 90, 129, 70 )
patients <- c( 86, 93, 136, 82 )

prop.test(smokers, patients)$estimate
#>    prop 1    prop 2    prop 3    prop 4
#> 0.9651163 0.9677419 0.9485294 0.8536585

smokers/patients

#> [1] 0.9651163 0.9677419 0.9485294 0.8536585

plot(smokers/patients, type = "b")
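
A simple numeric check of the direction (my own suggestion, not part of the
test output) is the sign of the slope of the proportions against the scores:

# Slope of the proportions vs. the default scores 1:4
scores <- seq_along(smokers)
coef(lm(smokers / patients ~ scores))["scores"]
#> roughly -0.035: negative, so the proportions are trending downward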



Hope this helps,

Rui Barradas

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] prop.trend.test

2023-09-07 Thread Thomas Subia via R-help


Colleagues

 Consider
smokers  <- c( 83, 90, 129, 70 )
patients <- c( 86, 93, 136, 82 )

 prop.trend.test(smokers, patients)

 Output:

 Chi-squared Test for Trend in Proportions

 data:  smokers out of patients ,

using scores: 1 2 3 4

X-squared = 8.2249, df = 1, p-value = 0.004132

 # trend test for proportions indicates proportions are trending.

 How does one identify the direction of trending?
 # prop.test indicates that the proportions are unequal but does little to
indicate trend direction.
All the best,
Thomas Subia


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] geom_smooth

2023-08-12 Thread Thomas Subia via R-help
Colleagues,

Your suggestions are elegant and greatly appreciated.

Thomas Subia






On Friday, August 11, 2023 at 11:08:42 PM PDT, Berwin A Turlach 
 wrote: 





G'day Thomas,

On Sat, 12 Aug 2023 04:17:42 + (UTC)
Thomas Subia via R-help  wrote:

> Here is my reproducible code for a graph using geom_smooth

The call "library(tidyverse)" was missing. :)

> I'd like to add a black boundary around the shaded area. I suspect
> this can be done with geom_ribbon but I cannot figure this out. Some
> advice would be welcome.

This works for me:

ggplot(scatter_data, aes(x = x_var, y = y_var)) +
  geom_point() +
  geom_smooth(se = TRUE, fill = "blue", color = "black", linetype = "dashed") +
  geom_ribbon(stat = "smooth",
              aes(ymin = after_stat(ymin), ymax = after_stat(ymax)),
              fill = NA, color = "black") +
  theme_cowplot()


Cheers,
    
    Berwin

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] geom_smooth

2023-08-11 Thread Thomas Subia via R-help
Colleagues,

Here is my reproducible code for a graph using geom_smooth
set.seed(55)
scatter_data <- tibble(x_var = runif(100, min = 0, max = 25)
   ,y_var = log2(x_var) + rnorm(100))

library(ggplot2)
library(cowplot)

ggplot(scatter_data,aes(x=x_var,y=y_var))+
  geom_point()+
  geom_smooth(se=TRUE,fill="blue",color="black",linetype="dashed")+
  theme_cowplot()

I'd like to add a black boundary around the shaded area. I suspect this can be 
done with geom_ribbon but I cannot figure this out. Some advice would be 
welcome.

Thanks!

Thomas Subia

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotly

2023-07-25 Thread Thomas Subia via R-help
 Colleagues,
Here is my reproducible code.
library(plotly)

t <- list(
  family = "Arial",
  size = 12,
  color = "black",
  face="bold")


t3 <- list(
  family = "Arial",
  size = 12,
  color = "black",
  face="bold")

t5 <- list(
  family = "Arial",
  size = 12,
  color = "black",
  face="bold")

plot_ly(
  domain = list(x = c(0, .5), y = c(0,0.5)),
  value = 319,
  title = list(text = "Example\n4,800 Target",font=t),
  type = "indicator",
  mode = "gauge+number+delta",
  number = list(valueformat = "%.2f",font=t5), 
  delta = list(reference = 4800, valueformat = "%.2f",font=t3),
  gauge = list(
    axis = list(range = list(NULL, 9600), 
    tickfont = list(size = 12, face="bold",family = "Arial")),
    bar = list(color = "black"),
    steps = list(
  list(range = c(0, 4800), color = '#AAFF00'),
  list(range = c(4800, 9600), color = '#ffb2b2')),
    threshold = list(
  line = list(color = "black", width = 8),
  thickness = 0.75,
  value = 4800)))
Question 1: How can I save the plotly graph with a specific graph size? Using
ggsave, one can define the height, width and unit of measure; is there
something similar in plotly? Since I am creating the same gauge plot with
different data, I'd like all the resulting graphs to have the same size. I've
googled this but to no avail.

Question 2: How can I adjust the tick marks of the gauge? Ideally the tick
marks should be set at seq(0, 9600, 1200).
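
In case it clarifies what I am after, here is an untested sketch of what I
imagine the answers might look like (plotly's save_image()/kaleido export and
the gauge axis tickvals setting are guesses on my part):

library(plotly)

p <- plot_ly(
  value = 319,
  type = "indicator",
  mode = "gauge+number",
  gauge = list(
    axis = list(
      range    = list(NULL, 9600),
      tickmode = "array",
      tickvals = seq(0, 9600, 1200))))  # fixed tick positions

# Export with a fixed pixel size (needs the kaleido Python package installed)
save_image(p, "gauge.png", width = 800, height = 600)
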
Thanks everyone!
Thomas Subia

On Saturday, July 22, 2023 at 08:10:52 AM PDT, Boris Steipe 
 wrote:  
 
 ChatGPT-4:

-- Query:  --

Thank you. The `value` in the code example is 2874. The plot shows a large 
number at the center and a smaller value computed as a delta relative to the 
`reference` of 4800. But the large value is given as 2870, and the smaller 
value is given as -1930 i.e. both values are rounded. Can I control the 
precision of these two numbers?


-- Response:  ---

In Plotly, you can control the precision of the numbers by specifying the 
`valueformat` and `delta.valueformat` parameters. These parameters accept a 
string in the format that is used by the C `printf` function. For example, 
`%.2f` specifies a number with two digits after the decimal point. Here is an 
example:

```R
plotly::plot_ly(
  domain = list(x = c(0, 1), y = c(0, 1)),
  value = 2874,
  title = list(text = "Generic"),
  type = "indicator",
  mode = "gauge+number+delta",
  number = list(valueformat = "%.2f"), 
  delta = list(reference = 4800, valueformat = "%.2f"),
  gauge = list(
    axis = list(range = list(NULL, 5000), 
                tickfont = list(size = 20, family = "Arial")),
    bar = list(color = "blue"),
    steps = list(
      list(range = c(0, 4800), color = "white"),
      list(range = c(4800, 6000), color = "red")),
    threshold = list(
      line = list(color = "black", width = 6),
      thickness = 0.75,
      value = 4800)))
```
In this example, both the main number and the delta value will be displayed 
with two decimal places. You can adjust these strings to display the numbers 
with the precision you prefer.

-

Here you go. I have no idea how it does this, and have a discussion on elements 
of "Sein und Zeit" and book 19 of the Odyssey, and tutor me in Italian 
conversation, and draft Mermaid mind-maps - but here we are.


:-)






> 
> In this example, the value was 2324, but the gauge displays 2320.
> I'm not sure why this occurs. Changing the statement to 56 results in the
> gauge reading 60. I'm not sure what needs to be changed in the script or the
> environment to stop rounding.




> On Jul 22, 2023, at 10:43, Boris Steipe  wrote:
> 
> What do you mean "Rounded"?
> What do you expect, what do you get instead?
> 
> ?
> 
> 
> 
> 
> 
>> On Jul 22, 2023, at 10:40, Thomas Subia via R-help  
>> wrote:
>> 
>> Colleagues,
>> Thanks for the update.
>> My colleagues at work have run this script but the resulting graph output 
>> for value is rounded. How can one turn this annoying feature off?
>> I've googled this but to no avail.
>> 
>> 
>> 
>> 
>> [[alternative HTML version deleted]]
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducibl

Re: [R] plotly

2023-07-22 Thread Thomas Subia via R-help
 Many thanks for this! This is an elegant solution!
After looking at my script, I still do not understand how the rounding
occurs. After reading the plotly documentation, unless I am missing something,
this wasn't covered.


On Saturday, July 22, 2023 at 08:10:52 AM PDT, Boris Steipe 
 wrote:  
 
 ChatGPT-4:

-- Query:  --

Thank you. The `value` in the code example is 2874. The plot shows a large 
number at the center and a smaller value computed as a delta relative to the 
`reference` of 4800. But the large value is given as 2870, and the smaller 
value is given as -1930 i.e. both values are rounded. Can I control the 
precision of these two numbers?


-- Response:  ---

In Plotly, you can control the precision of the numbers by specifying the 
`valueformat` and `delta.valueformat` parameters. These parameters accept a 
string in the format that is used by the C `printf` function. For example, 
`%.2f` specifies a number with two digits after the decimal point. Here is an 
example:

```R
plotly::plot_ly(
  domain = list(x = c(0, 1), y = c(0, 1)),
  value = 2874,
  title = list(text = "Generic"),
  type = "indicator",
  mode = "gauge+number+delta",
  number = list(valueformat = "%.2f"), 
  delta = list(reference = 4800, valueformat = "%.2f"),
  gauge = list(
    axis = list(range = list(NULL, 5000), 
                tickfont = list(size = 20, family = "Arial")),
    bar = list(color = "blue"),
    steps = list(
      list(range = c(0, 4800), color = "white"),
      list(range = c(4800, 6000), color = "red")),
    threshold = list(
      line = list(color = "black", width = 6),
      thickness = 0.75,
      value = 4800)))
```
In this example, both the main number and the delta value will be displayed 
with two decimal places. You can adjust these strings to display the numbers 
with the precision you prefer.

-

Here you go. I have no idea how it does this, and have a discussion on elements 
of "Sein und Zeit" and book 19 of the Odyssey, and tutor me in Italian 
conversation, and draft Mermaid mind-maps - but here we are.


:-)






> 
> In this example, the value was 2324, but the gauge displays 2320.
> I'm not sure why this occurs. Changing the statement to 56 results in the
> gauge reading 60. I'm not sure what needs to be changed in the script or the
> environment to stop rounding.




> On Jul 22, 2023, at 10:43, Boris Steipe  wrote:
> 
> What do you mean "Rounded"?
> What do you expect, what do you get instead?
> 
> ?
> 
> 
> 
> 
> 
>> On Jul 22, 2023, at 10:40, Thomas Subia via R-help  
>> wrote:
>> 
>> Colleagues,
>> Thanks for the update.
>> My colleagues at work have run this script but the resulting graph output 
>> for value is rounded. How can one turn this annoying feature off?
>> I've googled this but to no avail.
>> 
>> 
>> 
>> 
>> [[alternative HTML version deleted]]
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotly

2023-07-22 Thread Thomas Subia via R-help
Colleagues,
Thanks for the update.
My colleagues at work have run this script but the resulting graph output for 
value is rounded. How can one turn this annoying feature off?
I've googled this but to no avail.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plotly question

2023-07-21 Thread Thomas Subia via R-help
Colleagues

Here is my reproducible code

plot_ly(
  domain = list(x = c(0, 1), y = c(0, 1)),
  value = 2874,
  title = list(text = "Generic"),
  type = "indicator",
  mode = "gauge+number+delta",
  delta = list(reference = 4800),
  gauge = list(
    axis =list(range = list(NULL, 5000)),
    steps = list(
    list(range = c(0, 4800), color = "white"),
    list(range = c(4800, 6000), color = "red")),
    threshold = list(
    line = list(color = "black", width = 6),
    thickness = 0.75,
    value = 4800)))

How can I change the indicator color from green to some other color?

How can I change the typeface and font size of the speedometer tick mark font 
size?

Thomas Subia

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] prop.test question

2023-07-11 Thread Thomas Subia via R-help
Colleagues
Consider:
smokers  <- c( 83, 90, 129, 70 )
patients <- c( 86, 93, 136, 82 )
prop.test(smokers, patients)
    4-sample test for equality of proportions
    without continuity correction

data:  smokers out of patients
X-squared = 12.6, df = 3, p-value = 0.005585
alternative hypothesis: two.sided
sample estimates:
   prop 1    prop 2    prop 3    prop 4 
0.9651163 0.9677419 0.9485294 0.8536585 

Now:
prop.trend.test(smokers, patients)

    Chi-squared Test for Trend in Proportions

data:  smokers out of patients,
using scores: 1 2 3 4
X-squared = 8.2249, df = 1, p-value = 0.004132

How can one determine the direction of the trend?
Thomas Subia




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [External] latticeExtra

2023-06-05 Thread Thomas Subia via R-help
 Colleagues,
Thanks for the help!
Root cause of the problem was not defining z and x as factors! Now I know
better.
All the best,
Thomas Subia


On Monday, June 5, 2023 at 08:45:39 PM PDT, Richard M. Heiberger 
 wrote:  
 
 This works.
> d$zz <- factor(d$z, levels=c("low","med","high"))
> d$xx <- as.factor(d$x)
> cloud(y~xx+zz, d, panel.3d.cloud=panel.3dbars, col.facet='grey', 
+      xbase=0.4, ybase=0.4, scales=list(arrows=FALSE, col=1), 
+      par.settings = list(axis.line = list(col = "transparent")))
> 

the default levels for factor are alphabetic.  That is ok for d$x.

  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] latticeExtra

2023-06-05 Thread Thomas Subia via R-help
Colleagues,

I am trying to create a 3D barplot using the following script

d <- read.table(text=' x   y z
t1   5   high
t1   2   low
t1   4   med
t2   8   high
t2   1   low
t2   3   med
t3  50   high
t3  12   med
t3  35   low', header=TRUE)

library(latticeExtra)

cloud(y~x+z, d, panel.3d.cloud=panel.3dbars, col.facet='grey', 
  xbase=0.4, ybase=0.4, scales=list(arrows=FALSE, col=1), 
  par.settings = list(axis.line = list(col = "transparent")))

Executing this results in this error message

Error using packet 1
non-numeric argument to binary operator

I suspect that this error stems from read.table.

This graph is easily done with Excel but I'd rather use R

Any help would be appreciated.

Thanks

Thomas Subia

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Pairwise comparison test

2023-05-12 Thread Thomas Subia via R-help
Colleague,

smokers  <- c( 83, 90, 129, 70 )
patients <- c( 86, 93, 136, 82 )
pairwise.prop.test(smokers, patients)

# Output

Pairwise comparisons using Pairwise comparison of proportions 

data:  smokers out of patients 

   1    2    3
2 1.000 -     -
3 1.000 1.000 -
4 0.119 0.093 0.124

Is there a way to generate a graph which illustrates the pairwise comparison of 
proportions?

Thomas Subia

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Shadow Graphics Error in R Studio

2023-01-26 Thread Thomas Subia via R-help
Brinkley,

I am using R studio with
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

I cannot reproduce your error messages.
That being said, you might want to look at:
 
https://github.com/rstudio/rstudio/issues/2214 
https://stackoverflow.com/questions/19513705/error-in-rstudiogd-shadow-graphics-device-error-r-error-4-r-code-execution

Hope this helps!

Thomas Subia

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] pairwise.var.test

2022-10-30 Thread Thomas Subia via R-help
Colleagues,

Thank you all for the timely suggestions. That is appreciated.

What I am really looking for a way to identify difference in group level 
variance by using multiple comparison intervals. Minitab displays those results 
in a graph.

This method is described in:
https://support.minitab.com/en-us/minitab/20/media/pdfs/translate/Multiple_Comparisons_Method_Test_for_Equal_Variances.pdf

I was hoping that R had something similar.

I tried a Google search on this but to no avail.

Thomas Subia






On Sunday, October 30, 2022 at 03:44:54 PM PDT, Rui Barradas 
 wrote: 





At 21:47 on 30/10/2022, Jim Lemon wrote:
> Hi Thomas,
> I have assumed the format of your p-value matrix. This may require
> some adjustment.
> 
>    A          B          C        D          E          F
> A 1          0.7464    0.0187    0.0865      0.0122      0.4693
> B 0.7464    1          0.0358    0.1502      0.0173      0.3240
> C 0.0187    0.0358    1        0.5131      0.7185      0.0050
> D 0.0865    0.1502    0.5131    1          0.3240      0.0173
> E 0.0122    0.0173    0.7185    0.3240      1          0.0029
> F 0.4693    0.3240    0.0050    0.0173      0.0029      1
> 
> pvar.mat<-as.matrix(read.table(text=
>  "1          0.7464    0.0187    0.0865      0.0122      0.4693
>  0.7464    1          0.0358    0.1502      0.0173      0.3240
>  0.0187    0.0358    1        0.5131      0.7185      0.0050
>  0.0865    0.1502    0.5131    1          0.3240      0.0173
>  0.0122    0.0173    0.7185    0.3240      1          0.0029
>  0.4693    0.3240    0.0050    0.0173      0.0029      1",
>  stringsAsFactors=FALSE))
> rownames(pvar.mat)<-colnames(pvar.mat)<-LETTERS[1:6]
> pvar.col<-matrix(NA,nrow=6,ncol=6)
> pvar.col[pvar.mat < 1]<-"red"
> pvar.col[pvar.mat < 0.05]<-"orange"
> pvar.col[pvar.mat < 0.01]<-"green"
> library(plotrix)
> par(mar=c(6,4,4,2))
> color2D.matplot(pvar.mat,cellcolors=pvar.col,
>  main="P-values for matrix",axes=FALSE)
> axis(1,at=seq(0.5,5.5,by=1),labels=LETTERS[1:6])
> axis(2,at=seq(0.5,5.5,by=1),labels=rev(LETTERS[1:6]))
> color.legend(0,-1.3,2.5,-0.7,c("NA","NS","<0.05","<0.01"),
>  rect.col=c(NA,"red","orange","green"))
> 
> Jim
> 
> On Mon, Oct 31, 2022 at 6:34 AM Thomas Subia via R-help
>  wrote:
>>
>> Colleagues,
>>
>> The RVAideMemoire package has a pairwise variance test which one can use to 
>> identify variance differences between group levels.
>>
>> Using the example from this package, 
>> pairwise.var.test(InsectSprays$count,InsectSprays$spray), we get this output:
>>
>>      Pairwise comparisons using F tests to compare two variances
>>
>> data:  InsectSprays$count and InsectSprays$spray
>>
>>    A              B            C            D            E
>> B 0.7464    -              -              -              -
>> C 0.0187    0.0358    -      -      -
>> D 0.0865    0.1502    0.5131    -            -
>> E 0.0122    0.0173    0.7185    0.3240    -
>> F 0.4693    0.3240    0.0050    0.0173    0.0029
>>
>> P value adjustment method: fdr
>>
>> Is there a way to graph the pairwise variance differences so that users can 
>> easily identify the statistically significant variance differences between 
>> group levels?
>>
>> I can do this using Minitab but I'd prefer using R for this.
>>
>> Thomas Subia
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Hello,

With Jim's data creation code, here is a ggplot graph.

First coerce to data.frame, then reshape to long format.
Now bin the p-values with the cutpoints 0.01, 0.05 and 1. This is done
with ?findInterval.

The colors are assigned in the plot code, based on the binned p.values 
above.


library(ggplot2)

pvar.mat |> as.data.frame() -> pvar.df
pvar.df$id <- row.names(pvar.df)

pvar.df |> tidyr::pivot_longer(-id, values_to = "p.value") -> pvar.long
i <- findInterval(pvar.long$p.value, c(0, 0.01, 0.05, 1))
pvar.long$p.value <- c("<0.01&q

[R] pairwise.var.test

2022-10-30 Thread Thomas Subia via R-help
Colleagues,

The RVAideMemoire package has a pairwise variance test which one can use to 
identify variance differences between group levels.

Using the example from this package, 
pairwise.var.test(InsectSprays$count,InsectSprays$spray), we get this output:

    Pairwise comparisons using F tests to compare two variances

data:  InsectSprays$count and InsectSprays$spray

  A      B      C      D      E
B 0.7464 -      -      -      -
C 0.0187 0.0358 -      -      -
D 0.0865 0.1502 0.5131 -      -
E 0.0122 0.0173 0.7185 0.3240 -
F 0.4693 0.3240 0.0050 0.0173 0.0029

P value adjustment method: fdr

Is there a way to graph the pairwise variance differences so that users can 
easily identify the statistically significant variance differences between 
group levels?

I can do this using Minitab but I'd prefer using R for this.

Thomas Subia

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How long does it take to learn the R programming language?

2022-09-28 Thread Thomas Subia via R-help
The length of time it takes to learn R is dependent on what you want to use R 
for.
Let's assume you want to use R to perform basic statistical analyses on your 
own.

IMHO, the best book for self-study for this is Andy Field's Discovering
Statistics Using R. It's the best book because it gives you all the necessary
background for using basic statistical tests and the associated R scripts
needed to perform them.

You might also want to consider Winston Chang's R Graphics Cookbook, since
statistical analyses often need graphs to illustrate the data. It is the best
book to teach you this because it gives you all the necessary R scripts to
achieve this.

The learning curve for learning basic statistical analyses using R is steep 
relative to SPSS and Minitab. SPSS and Minitab use GUIs to easily guide the 
user to the desired statistical analysis. Typically the user does not have to 
write commands to do this. R does offer R Commander which has a GUI to guide 
the user to data input, graphing and statistical analyses. You may find this 
useful. I suspect that as you become more comfortable using R, you will become 
less dependent on R Commander for your analyses. 

Hope this helps!

Thomas Subia


 

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] transformTukey

2021-12-15 Thread Thomas Subia via R-help
Colleagues,

I attempted to copy data from the clipboard and use rcompanion's
transformTukey command in an attempt to normalize the dataset.

data = read.delim("clipboard")

head(data)
  Flatness
1    7e-04
2    1e-03
3    8e-04
4    5e-04
5    5e-04
6    5e-04

All data are greater than 0.
Data set is heavily right skewed

> library(rcompanion)
> trans_data <- transformTukey(data)

Error in is.infinite(TRANS) : 
  default method not implemented for type 'list'

Why is this occurring?
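
One guess I have not tried yet is that transformTukey() wants a numeric vector
rather than a data frame, so passing the column itself may be the fix (a
sketch, assuming the column is named Flatness):

library(rcompanion)

# Pass the numeric column, not the whole data frame
trans_data <- transformTukey(data$Flatness)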

Thomas Subia

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] . Re: Splitting a data column randomly into 3 groups

2021-09-04 Thread Thomas Subia via R-help
I was wondering if this is a good alternative method to split a data column
into distinct groups.
Let's say I want my first group to have 4 elements selected randomly

mydata <- LETTERS[1:11] 
 random_grp <- sample(mydata,4,replace=FALSE)

Now random_grp is:
> random_grp
[1] "H" "E" "A" "D"
# How's that for a random selection!

Now my choices for another group of random data now becomes:
 data_wo_random <- setdiff(mydata,random_grp)

> data_wo_random
[1] "B" "C" "F" "G" "I" "J" "K"

Now from this reduced dataset, I can generate another random selection with
any size I choose.

One problem is that this becomes cumbersome when one's original dataset
is large or when one wants to split the original dataset into many
different subgroup sizes.

Nevertheless, it's an intuitive method which is relatively easy to
understand
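
A sketch of a less manual variant (group sizes 4, 4 and 3 here are just an
example):

set.seed(1)
mydata <- LETTERS[1:11]

# Randomly assign every element a group label up front, then split once
labels <- sample(rep(1:3, times = c(4, 4, 3)))
groups <- split(mydata, labels)
groups$`1`   # the first random group of 4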

Hope this helps!

Thomas Subia
Statistician

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] uniroot

2021-08-27 Thread Thomas Subia via R-help
Colleagues,

I've been using uniroot to identify a root of an equation. 
As a check, I always verify that calculated root. 
This is where I need some help.

Consider the following script

fun <- function(x) {x^x -23}

# Clearly the root lies somewhere between 2.75 and 3.00

uniroot(fun, lower = 2.75, upper = 3.00,  tol = 0.001)

# output
$root
[1] 2.923125

$f.root
[1] 0.0001136763

# Let's verify this root.

2.923125^2.923125 - 23

0.000125

This result is different from what uniroot calculated:
0.000125      # verified check using x = 2.923125
0.0001136763  # using $f.root

Does this imply that the root output of  2.923125 may need more significant
digits displayed?

I suspect that whatever root is calculated, that root may well be dependent
on what interval one defines where the root may occur
and what tolerance one has input.
I am not sure that is the case, nevertheless, it's worth asking the
question.
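
To illustrate what I mean, tightening tol and printing more digits seems to
reconcile $root and $f.root:

fun <- function(x) x^x - 23

res <- uniroot(fun, lower = 2.75, upper = 3.00, tol = 1e-10)
print(res$root, digits = 12)   # more digits than the default display
fun(res$root)                  # essentially zero at this tolerance
fun(round(res$root, 6))        # rounding the root reintroduces a small residual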

Some guidance would be appreciated.

Thanks!

Thomas Subia

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sin curve question

2021-07-24 Thread Thomas Subia via R-help
Colleagues,

Here is my code which plots sin(x) vs x, for angles between 0 and 180
degrees.

library(ggplot2)
library(cowplot)   # for theme_cowplot()
library(REdaS)     # for deg2rad()

copdat <- data.frame(degrees = c(0, 45, 90, 135, 180))
copdat$radians <- deg2rad(copdat$degrees)
copdat$sin_x <- sin(copdat$radians)

ggplot(copdat, aes(x = degrees, y = sin_x)) +
  geom_point(size = 2) + geom_line() +
  theme_cowplot() + xlab("x") +
  ylab("sin(x)") +
  scale_x_continuous(breaks = seq(0, 180, 30)) +
  ggtitle("sin(x) vs x\nx is in degrees")

My trig students would prefer a curved line plot similar to what can be
plotted with Excel smooth line functionality.
I wanted to provide a relatively simple R script using ggplot to do this
without having to resort to fitting a sine curve to these points.
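
For reference, here is the kind of output I am after: an untested sketch that
simply evaluates sin() on a fine grid of degrees and overlays the five points
(copdat is the data frame built above):

fine <- data.frame(degrees = seq(0, 180, by = 1))
fine$sin_x <- sin(fine$degrees * pi / 180)   # degrees to radians without REdaS

ggplot(fine, aes(x = degrees, y = sin_x)) +
  geom_line() +                                    # smooth-looking curve
  geom_point(data = copdat, size = 2) +            # the original five points
  scale_x_continuous(breaks = seq(0, 180, 30)) +
  labs(x = "x", y = "sin(x)", title = "sin(x) vs x\nx is in degrees")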

Some guidance would be appreciated.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Spie charts

2021-03-29 Thread Thomas Subia via R-help
Ferri,
Radar charts are often used to compare two or more items or groups on various
features or characteristics. However, as the number of groups increases, the
number of spokes on the radar chart increases, and users have a harder time
comparing radial distances and making comparisons between groups. You're
better off using a scatter plot for this. If, however, you still want to
create a radar chart, you may want to try the fmsb package.
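
If you do still want the radar chart, a minimal fmsb sketch looks something
like this (made-up scores; fmsb expects the first two rows to be the axis
maxima and minima):

library(fmsb)

# Rows 1-2 are the axis max/min; the remaining rows are the groups to compare
scores <- data.frame(
  speed = c(10, 0, 7, 5),
  cost  = c(10, 0, 4, 8),
  ease  = c(10, 0, 6, 6),
  row.names = c("max", "min", "Group A", "Group B"))
radarchart(scores)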

Hope this helps!
Thomas Subia
Statistician


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] readxl question

2020-08-21 Thread Thomas Subia via R-help
Colleagues,

 

I have 250 Excel files in a directory. Each of those files has the same
layout. The problem is that the data in each Excel file is not in
rectangular form. I've been using readxl to extract the data which I need.
Each of my metrics is stored in a particular cell. For each metric, I
create a text file which stores that metric.

 

library(plyr)

library(readxl)

 

files <- list.files(pattern="*.xls", full.names = FALSE)

 

# Extract Work Order

WO <- lapply(files, read_excel, sheet="Sheet1", range=("B9")) WO_list <-
as.data.frame(WO)

trans_WO <- t(WO_list)

 

write.table(trans_WO ,"WO.txt")

 

# Extract bubble 14_1

BUBBLE_14_1 <- lapply(files, read_excel, sheet="Sheet1", range=("c46"))
BUBBLE_14_1_list <- as.data.frame(BUBBLE_14_1)

trans_BUBBLE_14_1 <- t(BUBBLE_14_1_list)

 

write.table(trans_BUBBLE_14_1,"BUBBLE_14_1.txt")

 

 

# Extract bubble 14_2

BUBBLE_14_2 <- lapply(files, read_excel, sheet="Sheet1", range=("c62"))
BUBBLE_14_2_list <- as.data.frame(BUBBLE_14_2)

trans_BUBBLE_14_2 <- t(BUBBLE_14_2_list)

write.table(trans_BUBBLE_14_2,"BUBBLE_14_2.txt")

 

After the text files have been created, I cut and paste each column of data
to Excel.

This has worked fine if the number of cells I am extracting from a file is
small.

If the number gets larger, this method is inefficient.

 

Any advice on how to do this would be appreciated.
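
For context, the direction I have been experimenting with is reading every
cell of interest from each file in one pass and row-binding the results (an
untested sketch; the ranges are the ones above):

library(readxl)

files <- list.files(pattern = "*.xls", full.names = FALSE)

# Named cells to pull from each workbook
ranges <- c(WO = "B9", BUBBLE_14_1 = "C46", BUBBLE_14_2 = "C62")

read_one <- function(f) {
  vals <- vapply(ranges, function(rng) {
    as.character(read_excel(f, sheet = "Sheet1", range = rng,
                            col_names = FALSE)[[1]])
  }, character(1))
  data.frame(file = f, t(vals), stringsAsFactors = FALSE)
}

all_metrics <- do.call(rbind, lapply(files, read_one))
write.csv(all_metrics, "all_metrics.csv", row.names = FALSE)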

 

All the best,

 

Thomas Subia


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Grouping Question

2020-03-21 Thread Thomas Subia via R-help
Colleagues,

Here is my dataset.

Serial  Measurement  Meas_test  Serial_test
1       17           fail       fail
1       16           pass       fail
2       12           pass       pass
2       8            pass       pass
2       10           pass       pass
3       19           fail       fail
3       13           pass       pass

If a measurement is less than or equal to 16, then Meas_test is pass. Else
Meas_test is fail
This is easy to code.

Serial_test is a pass when all of the Meas_test values are pass for a given
serial; else Serial_test is a fail.
I'm at a loss to figure out how to do this in R.
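
One possible sketch, using ave() in base R (df stands in for the data above;
dplyr's group_by() plus all() would work the same way):

df <- data.frame(
  Serial      = c(1, 1, 2, 2, 2, 3, 3),
  Measurement = c(17, 16, 12, 8, 10, 19, 13))

# Row-level test
df$Meas_test <- ifelse(df$Measurement <= 16, "pass", "fail")

# Group-level test: pass only if every row within the serial passes
all_pass <- ave(df$Meas_test == "pass", df$Serial, FUN = all)
df$Serial_test <- ifelse(all_pass, "pass", "fail")
df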

Some guidance would be appreciated.

All the best,

Thomas Subia

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Combining text files

2019-12-03 Thread Thomas Subia via R-help
Colleagues,
I've got several text files which contain data for each metric I need to report
on. One text file contains the serial number data. Another has the customer and
work order number. Another has test data. All the text files have the same
number of rows but different numbers of columns.

I was thinking about using bind_cols() to do this. 

Am I on the right track here?
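
A sketch of what I have in mind (the file names here are placeholders):

library(readr)
library(dplyr)

# Placeholder file names; each file has the same number of rows,
# so column-binding lines them up
serials <- read_table("serials.txt")
orders  <- read_table("orders.txt")
tests   <- read_table("tests.txt")

combined <- bind_cols(serials, orders, tests)
write_csv(combined, "combined.csv")
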
All the best,
Thomas Subia
Statistician / Sr. Quality Engineer
IMG Precision Inc.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Extract lines from pdf files

2019-11-20 Thread Thomas Subia via R-help
Thanks all for the help. I appreciate the feedback
I've developed another method to extract my desired data from multiple pdfs in 
a directory.

# Combine all pdfs to a combined pdf
files <- list.files(pattern = "pdf$")
pdf_combine(files, output = "joined.pdf")

# creates a text file from joined.pdf
pdf_text("joined.pdf")
txt <- pdf_text("joined.pdf")
write.table(txt,file="mydata.txt")

# I need to extract the lines which match a line beginning with AMAT
lines <- readLines("mydata.txt")
date <- grep("AMAT",lines)

# output for date looks like [1]   6  62 118 174 230 286 342 398
# These are exactly the line positions I need.

Now that I've got the desired lines, I don't know how to extract the data from 
those lines.

Any advice would be appreciated.
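
One possible next step, assuming the fields on each matched line are
whitespace-separated:

# Keep only the matched lines, then split each one into fields
# (assumes whitespace-separated fields)
amat_lines  <- lines[date]
amat_fields <- strsplit(trimws(amat_lines), "\\s+")

# Turn them into a data frame (shorter lines are padded with NA)
n <- max(lengths(amat_fields))
amat_df <- as.data.frame(do.call(rbind, lapply(amat_fields, `[`, seq_len(n))))
head(amat_df)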

All the best,

Thomas Subia
Statistician / Quality Engineer
IMG Precision Inc.








On Wednesday, November 20, 2019, 07:58:08 AM PST, Eric Berger 
 wrote: 





Hi Thomas,
As Jeff wrote, your HTML email is difficult to read. This is a "plain
text" forum.
As for "pointers", here is one suggestion.
Since you write that you can do the necessary actions with a specific
file, try to write a function that carries out those actions for that
same file.
Except when implementing the function, replace any specific data with
the value of an argument passed into the function.
e.g.
txt <- pdf_text("10619.pdf")
would be replaced by
txt <- pdf_text(pdfFile)

and your function would have pdfFile as an argument, as in

myfunc <- function( pdfFile )

Since you can accomplish the task for this file without a function,
you should be able to accomplish the task with a function.
Once you succeed to do that you can then try passing the function
arguments that refer to the other files you need to process.

HTH,
Eric


On Wed, Nov 20, 2019 at 1:09 AM Jeff Newmiller  wrote:
>
> Please don't spam the mailing list. Especially with HTML format messages. See 
> the Posting Guide.
>
> PDF is designed to present data graphically. It is literally possible to 
> place every character in the page in random order and still achieve this 
> visual readability while practically making it nearly impossible to read. I 
> have encountered many PDF files with the same text placed on the page 
> multiple times... again scrambling your option to read it digitally. Tools 
> like "pdftools" can sometimes work when the program that generated the file 
> does so in a simple and extraction-friendly way... but there are no 
> guarantees, and your description suggests that it is likely that you won't be 
> able to accomplish your goal with this file.
>
> On November 19, 2019 11:52:20 PM GMT+01:00, Thomas Subia via R-help 
>  wrote:
> >
> >Colleagues,
> >
> >
> >
> >I can extract specific data from lines in a pdf using:
> >
> >
> >
> >library(pdftools)
> >
> >pdf_text("10619.pdf")
> >
> >txt <- pdf_text(".pdf")
> >
> >write.table(txt,file="mydata.txt")
> >
> >con <- file('mydata.txt')
> >
> >open(con)
> >
> >serial <- read.table(con,skip=5,nrow=1)  # Extract [3]
> >flatness <- read.table(con,skip=11,nrow=1)  # Extract [5]
> >
> >parallel1 <-read.table(con,skip=2,nrow=1)# Extract [5]
> >
> >parallel2 <-read.table(con,skip=4,nrow=1)# Extract [5]
> >
> >close(con)
> >
> >
> >
> ># note here that serial has 4 variables
> >
> ># flatness had 6 variables
> >
> ># parallel1 has 5 variables
> >
> ># parallel2 has 5 variables
> >
> >
> >
> ># this outputs the specific data I need
> >
> >serial[3]
> >
> >flatness[5]
> >
> >parallel1[5] # Note here that the txt format shows 0.0007, not
> >scientific; is there a way to format this to display the original data?
> >
> >parallel2[5] # Note here that the txt format shows 0.0006, not
> >scientific; is there a way to format this to display the original
> >data?
> >
> >
> >
> >I'd like to extend this code to all of the pdf files in a directory and
> >to generate a table of all the serial, flatness, parallel1 and parallel2
> >data.
> >
> >I'm not having a lot of success trying to build the script for this.
> >Some pointers would be appreciated.
> >All the best.
> >
> >Thomas Subia
> >
> >Statistician / Senior Quality Engineer
> >
> >
> >
> >      [[alternative HTML version deleted]]
> >
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >http

[R] Extract lines from pdf files

2019-11-19 Thread Thomas Subia via R-help


Colleagues,

 

I can extract specific data from lines in a pdf using:

 

library(pdftools)

pdf_text("10619.pdf")

txt <- pdf_text(".pdf")

write.table(txt,file="mydata.txt")

con <- file('mydata.txt')

open(con)

serial <- read.table(con,skip=5,nrow=1)  # Extract [3]
flatness <- read.table(con,skip=11,nrow=1)  # Extract [5]

parallel1 <-read.table(con,skip=2,nrow=1)# Extract [5]

parallel2 <-read.table(con,skip=4,nrow=1)# Extract [5]

close(con)

 

# note here that serial has 4 variables

# flatness had 6 variables

# parallel1 has 5 variables

# parallel2 has 5 variables

 

# this outputs the specific data I need

serial[3]

flatness[5]

parallel1[5] # Note here that the txt format shows 0.0007, not scientific; is
there a way to format this to display the original data?

parallel2[5] # Note here that the txt format shows 0.0006, not scientific; is
there a way to format this to display the original data?

 

I'd like to extend this code to all of the pdf files in a directory and to
generate a table of all the serial, flatness, parallel1 and parallel2 data.

I'm not having a lot of success trying to build the script for this. Some
pointers would be appreciated.
All the best.
 
Thomas Subia

Statistician / Senior Quality Engineer



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sample size required to estimate population variance

2019-07-04 Thread Thomas Subia via R-help
Thanks for the suggestion, but I'm not sure that it answers my original
question. I need to know how many samples I need to collect in order to
achieve a specific margin of error for a confidence interval for the
population variance. I'm not sure whether the pwr package does that.


On Thursday, July 4, 2019, 4:31:44 PM PDT, John  wrote:  
 
 On Tue, 2 Jul 2019 22:23:18 + (UTC)
Thomas Subia via R-help  wrote:

> Colleagues,
> Can anyone suggest a package or code which might help me calculate
> the minimum sample size required to estimate the population variance?
> I can do this in Minitab but I'd rather do this in R. Thomas Subia

You want to look into the capabilities of the pwr package.  This page
offers some information:

https://www.statmethods.net/stats/power.html

JWDougherty
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sample size required to estimate population variance

2019-07-02 Thread Thomas Subia via R-help
Colleagues,
Can anyone suggest a package or code which might help me calculate the minimum 
sample size required to estimate the population variance? I can do this in 
Minitab but I'd rather do this in R.
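
For what it is worth, one hedged sketch of what I have in mind: for a normal
population the (1 - alpha) confidence interval for the variance is
((n-1)s^2 / chisq(1-alpha/2), (n-1)s^2 / chisq(alpha/2)), so one can search
for the smallest n whose interval, expressed as a multiple of s^2, is narrow
enough:

# Smallest n such that the chi-square CI for the variance spans no more than
# 2 * rel_margin times s^2 (normality assumed; my own definition of "margin")
n_for_var_ci <- function(rel_margin = 0.5, conf = 0.95) {
  alpha <- 1 - conf
  for (n in 3:10000) {
    lower <- (n - 1) / qchisq(1 - alpha / 2, n - 1)   # lower CI factor
    upper <- (n - 1) / qchisq(alpha / 2, n - 1)       # upper CI factor
    if (upper - lower <= 2 * rel_margin) return(n)
  }
  NA_integer_
}

n_for_var_ci(rel_margin = 0.5)   # n needed for a CI spanning about +/- 50% of s^2
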
Thomas Subia
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Post Hoc for Homogeneity of variance

2019-06-03 Thread Thomas Subia via R-help
Colleagues,
When using Levene's test, I can identify whether there are any differences in
variance between factor levels. This is straightforward.
Is there a way to do a post hoc test to identify variance differences between
factor levels? This is not so straightforward.
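
One option I have been looking at (the same function discussed elsewhere in
this archive) is pairwise F tests with an FDR adjustment, e.g.:

library(RVAideMemoire)

# Pairwise two-variance F tests between factor levels;
# InsectSprays is just the package's example data
pairwise.var.test(InsectSprays$count, InsectSprays$spray)
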
All the best
Thomas Subia



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] High p-value

2019-03-20 Thread Thomas Subia via R-help
From a previous posting:

"This is my function:

wilcox.test(A,B, data = data, paired = FALSE)

It gives me high p value, though the median of A column is 6900 and B
column is 3500.

Why it gives p value high if there is a difference in the median?"

Let's examine your choice to use the Wilcoxon test with paired = FALSE.


If both A and B are given and paired is FALSE, a Wilcoxon rank sum test 
(equivalent to the Mann-Whitney test) is carried out. In this case, the null 
hypothesis is that the distributions of A and B differ by a location shift of 
mu, and the alternative is that they differ by some other location shift (and 
the one-sided alternative "greater" is that A is shifted to the right of B). 

You observed that the medians are different. While the medians may be 
different, the test is dependent on ranks not medians.
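
A tiny made-up illustration of that point: two samples whose medians differ
tenfold can still give a large p-value when the ranks are balanced.

A <- c(1, 2, 3, 100, 200)    # median 3
B <- c(10, 20, 30, 40, 50)   # median 30

wilcox.test(A, B, paired = FALSE)
# A holds the extreme ranks at both ends, so the rank sums are similar and
# the p-value is large despite the difference in medians.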

HTH,

Thomas Subia

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] About wilcox test

2019-03-12 Thread Thomas Subia via R-help
Javid wrote: 

"I have two set of data in excel:
A column( 16.38, -31, -16.77, 127, -57, 23.44 and so on)
B column ( -12, -59.23, -44, 34.23, 55.5, -12.12 and so on)

I run the wilcox test as :

wilcox.test(A , B, data = mydata, paired = FALSE)

I got always the p value very high, like 0.60

Even I make changes in the data, it gives me 0.7, 0.4 etc which is too high
than 0.05 and can not thus reject the null hypothesis.

What could be the problem as I know there is difference in the data?"
How did you conclude there is a difference in the data? A 2-sample t-test? Some
details and a data set would be helpful for someone to investigate this further.
Thomas Subia
Statistician
IMG Precision



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] read_xl question

2019-01-23 Thread Thomas Subia via R-help


Colleagues,

I have a workbook which has 3 worksheets.

I need to extract data from two specific cells from one of those worksheets.

 

I can use read_excel to do this for one file.

data<-read_excel("C:/Desktop/Excel_raw_data/0020-49785 8768.xls",

sheet="Flow Data",range=("b9:c10"))

 

How can I do this for all my Excel files in the directory?

 

I can get the list of Excel files using: files = list.files(pattern="*.xls")

But I’m not sure where to go from here.

Some guidance would be appreciated.
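
For reference, the pattern I have been trying to adapt (an untested sketch;
the directory path is the one above):

library(readxl)

files <- list.files("C:/Desktop/Excel_raw_data", pattern = "\\.xls$",
                    full.names = TRUE)

read_flow <- function(f) {
  cells <- read_excel(f, sheet = "Flow Data", range = "b9:c10")
  cbind(file = basename(f), cells)   # record which workbook each row came from
}

all_data <- do.call(rbind, lapply(files, read_flow))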

 

All the best

Thomas Subia

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ggplot2 error

2017-11-09 Thread Thomas Subia via R-help
Hello all,

 

Zeki(?) reported:

> ggplot(data = mtcars, aes(x= wt, y= mpg)) + geom_line()

> Error: Found object is not a stat.

 

Using R v3.4.62 and R studio, I'm unable to reproduce this error.

 

All the best,

Thomas Subia

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] difftime in years

2016-08-04 Thread Thomas Subia via R-help
Colleagues,

age_days <- difftime(Date,DOM,units="days")
date_vals$age_yrs <-  age_days/365.242

I'm trying to calculate the number of years between DOM and Date.
The output reads

 DOM   Date age_yrs
1 2005-04-04   2015-05-13   10.10563 days

How does one not output days?
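
A minimal sketch of one fix, assuming the Date and DOM columns live in
date_vals: drop the difftime class before dividing, so the "days" units are
not carried along.

# assumes date_vals$Date and date_vals$DOM are Date columns
age_days <- difftime(date_vals$Date, date_vals$DOM, units = "days")
date_vals$age_yrs <- as.numeric(age_days) / 365.242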

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] identifying outliers

2016-04-07 Thread Thomas Subia via R-help
Thanks for writing this great piece of code.

 

x = rnorm(100)

boxplot(x) # you shouldn't see any outliers here although sometimes you will

 

# lets add some outliers intentionally

x = c(21, 20, 25, x) # now 21, 20 and 25 are outliers

 

myboxplot <- boxplot(x) # now you should see your three outliers

 

myboxplot$out # it will print the values of the outliers

 

How does one amend this code to produce the outliers by a group?
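
One possible sketch, using boxplot's formula interface with made-up groups:

set.seed(1)
dat <- data.frame(g = rep(c("A", "B"), each = 50),
                  x = rnorm(100))
dat$x[c(1, 60)] <- c(20, -15)   # plant one outlier in each group

bp <- boxplot(x ~ g, data = dat)
bp$out     # the outlier values across all groups
bp$group   # which group (by position on the x-axis) each outlier belongs to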

 

All the best,

 

Thomas Subia


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.