[R] connect MSSQL server problem

2023-03-17 Thread Kai Yang via R-help
Hi Team,
I need to connect to two MSSQL servers. One server asks me to enter a user 
name and password, and I can use dbConnect (from the DBI package) to connect to 
that server. The other one does not ask me for credentials. But when I 
remove UID and PWD, it doesn't work. Then I tried to use my PC login 
credentials for UID and PWD, and it still doesn't allow me access. The error 
message:
Error: nanodbc/nanodbc.cpp:1021: 28000: [Microsoft][ODBC Driver 17 for 
SQL Server][SQL Server]Login failed for user '**'.
How should I modify the code (see below) to access the second MSSQL server?
con <- dbConnect(odbc(),
                 Driver   = "ODBC Driver 17 for SQL Server",
                 Server   = "abcdefghijklmn",
                 Database = "abc_def")
Thank you,
Kai
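
A server that does not prompt for credentials is often set up for Windows (integrated) authentication, which the thread does not confirm. A minimal sketch under that assumption: pass Trusted_Connection = "yes" instead of UID and PWD, so the driver uses the Windows account the R session runs under. The helper name connect_trusted is hypothetical, and the server and database values are the placeholders from the post.

```r
# Hypothetical helper; assumes the DBI and odbc packages are installed and
# that the server accepts Windows (integrated) authentication.
connect_trusted <- function(server, database) {
  DBI::dbConnect(
    odbc::odbc(),
    Driver             = "ODBC Driver 17 for SQL Server",
    Server             = server,
    Database           = database,
    Trusted_Connection = "yes"   # use the Windows login; no UID/PWD supplied
  )
}

# Usage (not run here, needs a reachable server):
# con <- connect_trusted("abcdefghijklmn", "abc_def")
```

If the login still fails, the Windows account itself may lack access to that server, which is a question for the database administrator rather than for the R code.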
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem to run python code in r markdown

2023-01-20 Thread Kai Yang via R-help
Hi Team,
I'm trying to run Python in R Markdown (flexdashboard). The code is 
below:

try Python=
```{r, include=FALSE, echo=TRUE}
library(reticulate)
py_install("numpy")
use_condaenv("base")
```
```{python}
import numpy as np
```

I got the error message below:
Error in py_call_impl(callable, dots$args, dots$keywords) :
  ModuleNotFoundError: No module named 'numpy'
Calls:  ... py_capture_output -> force ->  -> py_call_impl
In addition: There were 26 warnings (use warnings() to see them)
Execution halted


Based on the message, Python cannot find the numpy package. But I'm sure I 
installed the package. I don't know how to fix the problem. Please help.
Thank you,
Kai
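
One plausible cause, not confirmed in the thread: py_install() called without envname installs into reticulate's default environment (typically "r-reticulate"), while use_condaenv("base") then points the session at a different environment where numpy may be missing. A minimal sketch that keeps both steps on the same environment follows; setup_python is a hypothetical helper name, and whether installing into "base" is advisable is a separate question.

```r
# Hypothetical helper; assumes the reticulate package and a conda
# installation with an environment called `env` are available.
setup_python <- function(env = "base", packages = "numpy") {
  reticulate::use_condaenv(env, required = TRUE)    # pick the environment first
  reticulate::py_install(packages, envname = env)   # install into that same environment
  invisible(env)
}

# In the setup chunk of the R Markdown file one would then call:
# setup_python("base", "numpy")
```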


[R] is it possible to run multiple rmd files together

2023-01-04 Thread Kai Yang via R-help
Hi Team,
I have multiple rmd files (~50) for different study reports. I tried 
the source command to run them together, but it doesn't seem to work. Is there a 
way to run those rmd files from one script?
Thanks,
Kai
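
source() only evaluates plain R scripts, so it cannot process R Markdown files. A minimal sketch of one way to do this from a single script, assuming the rmarkdown package is installed and all reports sit in one folder (render_all is a hypothetical helper name):

```r
# Hypothetical helper: render every .Rmd file found in `dir`.
render_all <- function(dir = ".") {
  rmd_files <- list.files(dir, pattern = "\\.Rmd$", full.names = TRUE,
                          ignore.case = TRUE)
  for (f in rmd_files) {
    # a fresh environment per report keeps objects from leaking between files
    rmarkdown::render(f, envir = new.env())
  }
  invisible(rmd_files)
}

# Usage: render_all("C:/temp/reports")   # the path is a made-up example
```

Rendering each file in its own environment is a design choice; if the reports are meant to share objects, a common environment could be passed instead.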


Re: [R] add specific fields in for loop

2022-11-15 Thread Kai Yang via R-help
 Hi Rui,
This is a great help.
Thank you,
Kai
On Tuesday, November 15, 2022 at 11:48:12 AM PST, Rui Barradas wrote:

At 19:39 on 15/11/2022, Kai Yang wrote:
> Hello Rui,
> Yes, it should be the one I want. I modified the code as:
> 
> for(i in grep("data", names(df))) {
>    try2.un$ab2 <-
>      ifelse(grepl("ab2",  [i]),   [i], NA)
> }
> 
> But I got error message:
> Error: unexpected '[' in:
> "  try2.un$ab2 <-
>      ifelse(grepl("ab2",["
> 
> I think I am using [i] in the wrong way. Do you have any suggestions?
> Thanks,
> Kai
>
> On Tuesday, November 15, 2022 at 10:54:08 AM PST, Rui Barradas wrote:
>
> At 16:18 on 15/11/2022, Kai Yang via R-help wrote:
>> Hi Team,
>> I can write a for loop like this:
>> for (i in columns(df)){
>>      ..
>> }
>>
>> But it will work on all columns in data frame df. If I want to work on only 
>> some specific fields (say, fields whose names contain 'date'), how should I 
>> modify the for loop? I changed the code as below, but it doesn't work.
>> for (i in columns(df) %in% 'date' ){
>>      .
>> }
>>
>>
>> Thank you,
>> Kai
> Hello,
> 
> Something like this?
> 
> 
> for(i in grep("date", names(df))) {
>    #
> }
> 
> 
> Hope this helps,
> 
> Rui Barradas
> 
>    
Hello,


From what I understand of the problem, either of the two loops below does what 
the posted code seems to be trying to do. The first is better, no ifelse.


for(i in grep("data", names(df))) {
  is.na(try2.un$ab2) <- !grepl("ab2", df[[i]])
}

for(i in grep("data", names(df))) {
  try2.un$ab2 <- ifelse(grepl("ab2", df[[i]]), df[[i]], NA)
}


Hope this helps,

Rui Barradas
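
For comparison with the nested ifelse() that Kai posted in this thread, where the first data column containing "ab2" wins, here is a minimal sketch of a loop that keeps that behaviour by only filling rows that are still NA. The toy try2.un data frame and its values are made up for illustration.

```r
# Toy stand-in for try2.un; the real data frame has more columns.
try2.un <- data.frame(
  id    = 1:4,
  data1 = c("ab2-x", "zz", "zz", "zz"),
  data2 = c("zz", "ab2-y", "zz", "ab2-w"),
  data3 = c("ab2-q", "zz", "zz", "ab2-v"),
  stringsAsFactors = FALSE
)

# Select the data columns by name, however many there are.
data_cols <- grep("^data", names(try2.un), value = TRUE)

try2.un$ab2 <- NA_character_
for (col in data_cols) {
  # only rows without a match so far are filled, so the earliest column wins
  hit <- is.na(try2.un$ab2) & grepl("ab2", try2.un[[col]])
  try2.un$ab2[hit] <- try2.un[[col]][hit]
}
```

Rows with no match in any data column stay NA, as in the nested ifelse() version.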

  


Re: [R] add specific fields in for loop

2022-11-15 Thread Kai Yang via R-help
Hello Rui,
Yes, it should be the one I want. I modified the code as:

for(i in grep("data", names(df))) {
  try2.un$ab2 <-
    ifelse(grepl("ab2",  [i]),   [i], NA)
}

But I got error message: 
Error: unexpected '[' in:
"  try2.un$ab2 <-
    ifelse(grepl("ab2",["
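
The parse error arises because [i] has no object in front of it: R needs something to index. A minimal sketch, assuming the columns live in a data frame (the toy df below is made up for illustration), uses df[[i]] to pull out column i as a vector:

```r
df <- data.frame(
  id    = 1:3,
  data1 = c("ab2", "x", "y"),
  data2 = c("p", "ab2", "q"),
  stringsAsFactors = FALSE
)

idx <- grep("data", names(df))   # integer positions of the matching columns
hits <- list()
for (i in idx) {
  # df[[i]] is the i-th column as a plain vector, which grepl() can search
  hits[[names(df)[i]]] <- grepl("ab2", df[[i]])
}
```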

I think I am using [i] in the wrong way. Do you have any suggestions?
Thanks,
Kai

On Tuesday, November 15, 2022 at 10:54:08 AM PST, Rui Barradas wrote:

At 16:18 on 15/11/2022, Kai Yang via R-help wrote:
> Hi Team,
> I can write a for loop like this:
> for (i in columns(df)){
>    ..
> }
> 
> But it will work on all columns in data frame df. If I want to work on only 
> some specific fields (say, fields whose names contain 'date'), how should I 
> modify the for loop? I changed the code as below, but it doesn't work.
> for (i in columns(df) %in% 'date' ){
>    .
> }
> 
> 
> Thank you,
> Kai
Hello,

Something like this?


for(i in grep("date", names(df))) {
  #
}


Hope this helps,

Rui Barradas

  


Re: [R] add specific fields in for loop

2022-11-15 Thread Kai Yang via R-help
Hello Bert and Avi,
Sorry, it was a typo. It should be:
for (i in colnames(df)){
  ..
}
Below is the code I'm currently using:
try2.un$ab2 <-
  ifelse(grepl("ab2",try2.un$data1), try2.un$data1,
         ifelse(grepl("ab2",try2.un$data2), try2.un$data2,
                ifelse(grepl("ab2",try2.un$data3), try2.un$data3,
                       ifelse(grepl("ab2",try2.un$data4), try2.un$data4,
                              ifelse(grepl("ab2",try2.un$data5), try2.un$data5, NA
                              ) ) ) ) )


As you can see, it uses 5 fields (data1 to data5) in the ifelse function. I want to 
turn it into a for loop, because the number of data fields is dynamic. In this 
sample it is 5, but it may be more than 15 in some situations. So I want to use a 
loop to solve it and avoid writing that many ifelse statements. Also, the 
try2.un data frame has many other fields that I don't need to use in the 
loop. 
I'm not sure if a loop is the correct solution, but I'm willing to learn from any 
other suggestions you have.
Thanks,
Kai

On Tuesday, November 15, 2022 at 09:23:03 AM PST, avi.e.gr...@gmail.com wrote:
 
 Kai,

As Bert pointed out, it may not be clear what you want.

As a GUESS, you have some arbitrary data.frame object with multiple columns and 
you want to do something on selected columns. Consider changing your idea to be 
in several stages for simplicity and then optionally later rewriting it.

So step 1 is to get a vector of column names. The normal way to do this in base 
R is not with a function called columns(df) but colnames(df) ...

Step 2 is to use one of many techniques that take that vector of names and 
select the ones you want to keep. In base R there are many ways to do that 
including using regular expressions as in the "grep" family of functions. You 
may end up with a new vector of names perhaps shorter or in a different order.

Step 3 is to use those names in your loop. If you want say to convert a column 
from character to numeric, and your loop index is "current" you might write 
something like:
    df[[current]] <- as.numeric(df[[current]])

There are many ways and it depends on what exactly you want to do. There are 
packages designed to make some of these things fairly simple, such as dplyr 
where you can ask to match names that start or end a certain way or that are of 
certain types.

Avi
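
The three steps above can be sketched as follows. The toy df and its column names are assumptions made for illustration. Note the double brackets: df[[current]] extracts the column as a vector, whereas as.numeric() applied to a one-column data frame such as df[current] fails.

```r
df <- data.frame(
  id         = 1:3,
  start_date = c("1", "2", "3"),
  end_date   = c("4", "5", "6"),
  label      = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

all_names <- colnames(df)                         # step 1: all column names
wanted <- grep("date", all_names, value = TRUE)   # step 2: keep the ones of interest
for (current in wanted) {                         # step 3: loop over the selection
  df[[current]] <- as.numeric(df[[current]])
}
```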

-Original Message-
From: R-help  On Behalf Of Kai Yang via R-help
Sent: Tuesday, November 15, 2022 11:18 AM
To: R-help Mailing List 
Subject: [R] add specific fields in for loop

Hi Team,
I can write a for loop like this:
for (i in columns(df)){
  ..
}

But it will work on all columns in data frame df. If I want to work on only some 
specific fields (say, fields whose names contain 'date'), how should I modify the 
for loop? I changed the code as below, but it doesn't work.
for (i in columns(df) %in% 'date' ){
  .
}


Thank you,
Kai



[R] add specific fields in for loop

2022-11-15 Thread Kai Yang via R-help
Hi Team,
I can write a for loop like this:
for (i in columns(df)){
  ..
}

But it will work on all columns in data frame df. If I want to work on only some 
specific fields (say, fields whose names contain 'date'), how should I modify the 
for loop? I changed the code as below, but it doesn't work.
for (i in columns(df) %in% 'date' ){
  .
}


Thank you,
Kai
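
A short sketch of why the second attempt fails, using a made-up df: %in% tests whether each column name is exactly equal to 'date' and returns a logical vector, so the loop would iterate over TRUE/FALSE values rather than column names. Matching part of a name is a job for grep(). Also, base R has colnames() (or names()), not columns().

```r
df <- data.frame(
  id         = 1:2,
  visit_date = c("2022-11-01", "2022-11-15"),
  birth_date = c("1990-01-01", "1985-06-30"),
  stringsAsFactors = FALSE
)

exact <- colnames(df) %in% "date"                       # exact match only: all FALSE here
date_cols <- grep("date", colnames(df), value = TRUE)   # partial match on the name

for (i in date_cols) {
  df[[i]] <- as.Date(df[[i]])   # example operation on each selected column
}
```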



Re: [R] automatic convert list to dataframe

2022-10-03 Thread Kai Yang via R-help
>> list2env(file1, envir = .GlobalEnv)
>>
>>
>> will create data.frames dx1, dx2, etc, in the global environment.
>> If you really need the names file1_dx1, file1_dx2, etc, you can first
>> change the names
>>
>>
>> names(file1) <- paste("file1", names(file1), sep = "_")
>>
>>
>> and then run list2env like above.
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>> At 16:51 on 03/10/2022, Kai Yang via R-help wrote:
>>> Hi R team,
>>> I can use the rio package to read an Excel file into R as a list. The Excel 
>>> file contains multiple sheets (30 - 40 data sheets). I can convert each data 
>>> element into a data frame manually. I have multiple Excel files with multiple 
>>> data sheets. I need to load them into R and compare sheets with the same 
>>> name across different Excel files. My current code is:
>>> library(rio)
>>> setwd("C:/temp")
>>> filenames <- gsub("\\.xlsx$","", list.files(pattern="\\.xlsx$"))
>>> for(i in filenames){
>>>        assign(i, import_list(paste0(i, ".xlsx", sep="")))
>>> }
>>> file1_dx1     <-  file1[["dx1"]]
>>>
>>> file1_dx2     <-  file1[["dx2"]]
>>>
>>> file1_dx3     <-  file1[["dx3"]]
>>>
>>> file2_dx1     <-  file1[["dx1"]]
>>>
>>> file2_dx2     <-  file1[["dx2"]]
>>> ..
>>>
>>> I hope the code can automatically convert the list elements (there may be 
>>> 30 - 40 of them) into data frames named with the file name as a prefix 
>>> (such as filename_sheetname), all inside a for loop.
>>>
>>>
>>> Thank you,
>>> Kai


Re: [R] automatic convert list to dataframe

2022-10-03 Thread Kai Yang via R-help
Hi Rui,
I copied "list2env(i, envir = .GlobalEnv)" into the code, but I got the same 
error message: "first argument must be a named list". Maybe list2env cannot 
be used in a loop? The code works very well outside of the for loop.
One more thing: different files may have the same sheet names. That's why I want 
to add the file name in front of the sheet name, to avoid overwriting. It still works 
well outside of the loop, but doesn't work in the loop. I don't know how to fix the 
problems.
Thank you,
Kai
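
Inside the loop, i is only the file name (a character string); assign() creates a separate object with that name, so names(i) and list2env(i, ...) never see the imported list. A minimal sketch that keeps the imported list in its own variable follows. load_sheets and its reader argument are hypothetical names introduced here; the default reader assumes the rio package is installed.

```r
# Hypothetical helper: import each workbook, prefix the sheet names with the
# file name, and create one data frame per sheet in `envir`.
load_sheets <- function(filenames, envir = .GlobalEnv,
                        reader = function(path) rio::import_list(path)) {
  for (f in filenames) {
    sheets <- reader(paste0(f, ".xlsx"))                # named list, one element per sheet
    names(sheets) <- paste(f, names(sheets), sep = "_") # e.g. file1_dx1
    list2env(sheets, envir = envir)                     # one object per list element
  }
  invisible(NULL)
}

# Usage with the filenames vector from the original post:
# load_sheets(filenames)
```

Because the file name is part of each object name, sheets that share a name across files no longer overwrite each other.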

On Monday, October 3, 2022 at 12:09:04 PM PDT, Rui Barradas wrote:
 
 Hello,

If in each iteration i is a list, try removing the call to names().
Try, in the loop,


list2env(i, envir = .GlobalEnv)


The error message is telling you that list2env's first argument must be a 
named list, and names(i) is an unnamed vector. It's i that's the named 
list (you even changed its names in the previous instruction).

Hope this helps,

Rui Barradas

At 18:38 on 03/10/2022, Kai Yang wrote:
> Hi Rui,
> list2env(file1, envir = .GlobalEnv) worked very well. Thank you.
> 
> But when I tried to put the sample code into a for loop, I got an error message:
> for(i in filenames){
>    assign(i, import_list(paste0(i, ".xlsx", sep="")))
>    names(i) <- paste(i, names(i), sep = "_")
>    list2env(names(i), envir = .GlobalEnv)
> }
> Error in list2env(names(i), envir = .GlobalEnv) :   first argument must be a 
> named list
> 
> It seems I cannot put names(i) into the for loop. Could you please help me 
> debug it?
> Thank you,
> Kai
>
> On Monday, October 3, 2022 at 10:14:25 AM PDT, Rui Barradas wrote:
>  
>  Hello,
> 
> 
> list2env(file1, envir = .GlobalEnv)
> 
> 
> will create data.frames dx1, dx2, etc, in the global environment.
> If you really need the names file1_dx1, file1_dx2, etc, you can first
> change the names
> 
> 
> names(file1) <- paste("file1", names(file1), sep = "_")
> 
> 
> and then run list2env like above.
> 
> Hope this helps,
> 
> Rui Barradas
> 
> At 16:51 on 03/10/2022, Kai Yang via R-help wrote:
>> Hi R team,
>> I can use the rio package to read an Excel file into R as a list. The Excel 
>> file contains multiple sheets (30 - 40 data sheets). I can convert each data 
>> element into a data frame manually. I have multiple Excel files with multiple 
>> data sheets. I need to load them into R and compare sheets with the same 
>> name across different Excel files. My current code is:
>> library(rio)
>> setwd("C:/temp")
>> filenames <- gsub("\\.xlsx$","", list.files(pattern="\\.xlsx$"))
>> for(i in filenames){
>>      assign(i, import_list(paste0(i, ".xlsx", sep="")))
>> }
>> file1_dx1     <-  file1[["dx1"]]
>>
>> file1_dx2     <-  file1[["dx2"]]
>>
>> file1_dx3     <-  file1[["dx3"]]
>>
>> file2_dx1     <-  file1[["dx1"]]
>>
>> file2_dx2     <-  file1[["dx2"]]
>> ..
>>
>> I hope the code can automatically convert the list elements (there may be 
>> 30 - 40 of them) into data frames named with the file name as a prefix 
>> (such as filename_sheetname), all inside a for loop.
>>
>>
>> Thank you,
>> Kai


Re: [R] automatic convert list to dataframe

2022-10-03 Thread Kai Yang via R-help
Hi Rui,
list2env(file1, envir = .GlobalEnv) worked very well. Thank you.

But when I tried to put the sample code into a for loop, I got an error message:
for(i in filenames){
  assign(i, import_list(paste0(i, ".xlsx", sep="")))
  names(i) <- paste(i, names(i), sep = "_")
  list2env(names(i), envir = .GlobalEnv)
}
Error in list2env(names(i), envir = .GlobalEnv) :
  first argument must be a named list

It seems I cannot put names(i) into the for loop. Could you please help me 
debug it?
Thank you,
Kai

On Monday, October 3, 2022 at 10:14:25 AM PDT, Rui Barradas wrote:
 
 Hello,


list2env(file1, envir = .GlobalEnv)


will create data.frames dx1, dx2, etc, in the global environment.
If you really need the names file1_dx1, file1_dx2, etc, you can first 
change the names


names(file1) <- paste("file1", names(file1), sep = "_")


and then run list2env like above.

Hope this helps,

Rui Barradas
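
The two steps above can be tried on a small stand-in list. The toy file1 below is made up, and the objects go into a scratch environment (target) rather than the global one so the example leaves no trace:

```r
# Stand-in for the list that import_list() would return for one workbook.
file1 <- list(
  dx1 = data.frame(a = 1:2),
  dx2 = data.frame(b = 3:4)
)

# Prefix the sheet names with the file name, then create one object per element.
names(file1) <- paste("file1", names(file1), sep = "_")
target <- new.env()
list2env(file1, envir = target)
```

After this, target holds two data frames named file1_dx1 and file1_dx2.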

At 16:51 on 03/10/2022, Kai Yang via R-help wrote:
> Hi R team,
> I can use the rio package to read an Excel file into R as a list. The Excel 
> file contains multiple sheets (30 - 40 data sheets). I can convert each data 
> element into a data frame manually. I have multiple Excel files with multiple 
> data sheets. I need to load them into R and compare sheets with the same 
> name across different Excel files. My current code is:
> library(rio)
> setwd("C:/temp")
> filenames <- gsub("\\.xlsx$","", list.files(pattern="\\.xlsx$"))
> for(i in filenames){
>    assign(i, import_list(paste0(i, ".xlsx", sep="")))
> }
> file1_dx1     <-  file1[["dx1"]]
> 
> file1_dx2     <-  file1[["dx2"]]
> 
> file1_dx3     <-  file1[["dx3"]]
> 
> file2_dx1     <-  file1[["dx1"]]
> 
> file2_dx2     <-  file1[["dx2"]]
> ..
> 
> I hope the code can automatically convert the list elements (there may be 
> 30 - 40 of them) into data frames named with the file name as a prefix 
> (such as filename_sheetname), all inside a for loop.
> 
> 
> Thank you,
> Kai


[R] automatic convert list to dataframe

2022-10-03 Thread Kai Yang via R-help
Hi R team,
I can use the rio package to read an Excel file into R as a list. The Excel file 
contains multiple sheets (30 - 40 data sheets). I can convert each data element 
into a data frame manually. I have multiple Excel files with multiple data sheets. 
I need to load them into R and compare sheets with the same name across 
different Excel files. My current code is:
library(rio)
setwd("C:/temp")
filenames <- gsub("\\.xlsx$","", list.files(pattern="\\.xlsx$"))
for(i in filenames){
  assign(i, import_list(paste0(i, ".xlsx", sep="")))
}
file1_dx1     <-  file1[["dx1"]]
file1_dx2     <-  file1[["dx2"]]
file1_dx3     <-  file1[["dx3"]]
file2_dx1     <-  file1[["dx1"]]
file2_dx2     <-  file1[["dx2"]]
..

I hope the code can automatically convert the list elements (there may be 30 - 40 
of them) into data frames named with the file name as a prefix (such as 
filename_sheetname), all inside a for loop.


Thank you,
Kai
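
Since the stated goal is to compare sheets with the same name across files, one alternative to creating dozens of separate objects is to keep each workbook as a list and compare by name. This is a sketch of a different technique than the one asked for, with two made-up stand-in lists in place of real import_list() results:

```r
# Stand-ins for two imported workbooks (values are invented for illustration).
file1 <- list(dx1 = data.frame(a = 1:2), dx2 = data.frame(b = 3:4))
file2 <- list(dx1 = data.frame(a = 1:2), dx2 = data.frame(b = c(3L, 5L)))

# Sheets present in both workbooks.
shared <- intersect(names(file1), names(file2))

# TRUE where the two versions of a sheet are equal, FALSE otherwise.
same <- vapply(
  shared,
  function(s) isTRUE(all.equal(file1[[s]], file2[[s]])),
  logical(1)
)
```

Here same is a named logical vector with one entry per shared sheet name.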






Re: [R] rename files in R

2022-09-19 Thread Kai Yang via R-help
Thank you for your help. I'll check this out.
Kai

On Monday, September 19, 2022 at 11:47:11 AM PDT, Bert Gunter wrote:

I haven't tracked what went before, but your syntax here is totally messed up:
for (i in seq_len(nrow(total_name)))
{
  with(
    total_name[i, ], 
    {
      file.rename(total_name$orig, total_name$target)
    }
  )
}
Also, you don't need looping because file.rename is already vectorized -- you 
need to read the Help file more carefully. So something like this should do 
what you want:
with(total_name, file.rename (orig, target))
where orig and target should be character (not factor) columns in your data 
frame.

Cheers,
Bert
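
A minimal, self-contained sketch of the vectorized call described above. The files are created in a temporary folder purely for demonstration, and the columns are deliberately built as factors and then converted, since file.rename() requires character vectors:

```r
tmp <- tempfile("rename_demo")
dir.create(tmp)

# Made-up names; stringsAsFactors = TRUE mimics a data frame with factor columns.
total_name <- data.frame(
  orig   = file.path(tmp, c("out_1.pdf", "out_2.pdf")),
  target = file.path(tmp, c("abc_title.pdf", "def_title.pdf")),
  stringsAsFactors = TRUE
)

# Convert every column to character before handing it to file.rename().
total_name[] <- lapply(total_name, as.character)

invisible(file.create(total_name$orig))            # empty demo files
ok <- with(total_name, file.rename(orig, target))  # one call renames all rows
```

ok is a logical vector with one element per file, so failures can be checked afterwards.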
On Mon, Sep 19, 2022 at 10:38 AM Kai Yang via R-help wrote:

Hi Rui,
I put the original file names and target file names into a data frame, 
total_name. I can use
file.rename("out_1.pdf", "abc_title.pdf")

to rename a single file. But I hope to use a loop to complete the task, so I 
wrote the code below:
for (i in seq_len(nrow(total_name)))
{
  with(
    total_name[i, ], 
    {
      file.rename(total_name$orig, total_name$target)
    }
  )
}
But I got an error message: 
Error in file.rename(total_name$orig, total_name$target) : 
invalid 'from' argument

I think the problem is due to quoting. Do you know how to fix it?
Thanks,
Kai

On Friday, September 16, 2022 at 08:32:53 PM PDT, Rui Barradas wrote:

 Hello,

My understanding of the problem is different, the files' first row is 
not tabular data, I might be wrong but it seems to me that it's 
something like

first row [of] file1.txt [is]:
abc.txt

file2.txt:
bed.txt

etc.

That's why the sapply loop reads one datum only and exits.

Hope this helps,

Rui Barradas
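
Under the reading above, where the first line of each file holds its new name, a minimal sketch could look like this. The demo files are created in a temporary folder with the example names from the thread, and the pattern for matching the old names is an assumption:

```r
tmp <- tempfile("firstline_demo")
dir.create(tmp)
writeLines(c("abc.txt", "some content"), file.path(tmp, "file1.txt"))
writeLines(c("bed.txt", "more content"), file.path(tmp, "file2.txt"))

old <- list.files(tmp, pattern = "^file[0-9]+\\.txt$", full.names = TRUE)

# Read only the first line of each file; it holds the new file name.
first <- vapply(old, function(f) readLines(f, n = 1), character(1))
new <- file.path(tmp, trimws(first))

# Guard against data loss: no duplicate targets, no existing targets.
stopifnot(!anyDuplicated(new), !any(file.exists(new)))

ok <- file.rename(old, new)   # vectorized: one call for all files
```

The guard matters because two files with the same first line would otherwise overwrite each other.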

At 02:29 on 17/09/2022, Ebert, Timothy Aaron wrote:
> The syntax might not be quite right, but why not something like 
> file.rename(colnames(dataframe)[1]) -- Using colnames to get the names of the 
> columns that are in the first row, selecting the first element from 
> colnames() and setting the file name equal to that.
> 
> Do a for loop using current file names in some folder, and save to a new 
> folder.
> 
> Tim
> 
> -----Original Message-
> From: R-help  On Behalf Of Kai Yang via R-help
> Sent: Friday, September 16, 2022 1:52 PM
> To: R-help Mailing List ; Rui Barradas 
> 
> Subject: Re: [R] rename files in R
> 
> [External Email]
> 
> Hello,
> Here is the example:
>   file name       first row
>   file1.txt       abc.txt
>   file2.txt       bed.txt
>   file3.txt       gogo.txt
>   . .
>   file1243.txt    last.txt
> I want to use a loop because I need to read the first row of the first 
> file, then rename the file, then go to the next file. I'm not sure if this is 
> the right way to approach my goal. Any suggestions will be appreciated.
> Thanks,
> Kai
>
> On Friday, September 16, 2022 at 10:38:32 AM PDT, Rui Barradas wrote:
> 
>  Hello,
> 
> Please post the first row of 2 or 3 files and the expected result.
> 
> You can rename files with ?file.rename. This function is vectorized on its 
> arguments, so you do not need a loop, only the source and destination 
> filenames. Both vectors should have the same length; if not, strange things 
> will occur, including data loss.
> 
> Hope this helps,
> 
> Rui Barradas
> 
> At 18:26 on 16/09/2022, Kai Yang via R-help wrote:
>> Hello,
>> I have a lot of files without meaningful names, such as
>> file1.txt, file2.txt .. I need to rename them using the
>> information from the first row of each file. I can already get the
>> information from the first row of each file. Now I need to know how to
>> rename them in R (using a loop?). Thank you for your help.
>> Kai

Re: [R] rename files in R

2022-09-19 Thread Kai Yang via R-help
Hi Rui,
I put the original file names and target file names into a data frame, 
total_name. I can use
file.rename("out_1.pdf", "abc_title.pdf")

to rename a single file. But I hope to use a loop to complete the task, so I 
wrote the code below:
for (i in seq_len(nrow(total_name)))
{
  with(
    total_name[i, ], 
    {
      file.rename(total_name$orig, total_name$target)
    }
  )
}
But I got an error message: 
Error in file.rename(total_name$orig, total_name$target) : 
invalid 'from' argument

I think the problem is due to quoting. Do you know how to fix it?
Thanks,
Kai

On Friday, September 16, 2022 at 08:32:53 PM PDT, Rui Barradas wrote:
 
 Hello,

My understanding of the problem is different, the files' first row is 
not tabular data, I might be wrong but it seems to me that it's 
something like

first row [of] file1.txt [is]:
abc.txt

file2.txt:
bed.txt

etc.

That's why the sapply loop reads one datum only and exits.

Hope this helps,

Rui Barradas

At 02:29 on 17/09/2022, Ebert, Timothy Aaron wrote:
> The syntax might not be quite right, but why not something like 
> file.rename(colnames(dataframe)[1]) -- Using colnames to get the names of the 
> columns that are in the first row, selecting the first element from 
> colnames() and setting the file name equal to that.
> 
> Do a for loop using current file names in some folder, and save to a new 
> folder.
> 
> Tim
> 
> -----Original Message-
> From: R-help  On Behalf Of Kai Yang via R-help
> Sent: Friday, September 16, 2022 1:52 PM
> To: R-help Mailing List ; Rui Barradas 
> 
> Subject: Re: [R] rename files in R
> 
> [External Email]
> 
> Hello, here is the example:
>
>   file name       first row
>   file1.txt       abc.txt
>   file2.txt       bed.txt
>   file3.txt       gogo.txt
>   . .
>   file1243.txt    last.txt
>
> I want to use a loop because I need to read the first row of the first
> file, rename that file, and then go on to the next file. I'm not sure if this
> is the right way to approach my goal. Any suggestions will be appreciated.
> Thanks,
> Kai
>
> On Friday, September 16, 2022 at 10:38:32 AM PDT, Rui Barradas wrote:
> 
>  Hello,
> 
> Please post the first row of 2 or 3 files and the expected result.
> 
> You can rename files with ?file.rename. This function is vectorized on its
> arguments so you do not need a loop, only the source and destination
> filenames. Both vectors should have the same length; if not, strange things
> will occur, including data loss.
> 
> Hope this helps,
> 
> Rui Barradas
> 
> At 18:26 on 16/09/2022, Kai Yang via R-help wrote:
>> Hello, I have a lot of files without meaningful names, such as
>> file1.txt, file2.txt, ... I need to rename them using the
>> information from the first row of each file. I can already read the
>> first row of each file; now I need to know how to
>> rename the files in R (using a loop?). Thank you for your help. Kai

Re: [R] rename files in R

2022-09-16 Thread Kai Yang via R-help
 Thank you. I'll try this. --- Kai
On Friday, September 16, 2022 at 11:01:33 AM PDT, Rui Barradas wrote:
 
 Hello,

Something like the following might work.



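# Files to rename: file1.txt, file2.txt, ...
filenames <- list.files(pattern = "^file\\d+\\.txt$")
# Read only the first line of each file; it holds the new name.
destnames <- sapply(filenames, scan, what = character(), sep = "\n", n = 1L,
                    strip.white = TRUE)

# Braces keep `else` attached to `if` when this is run at top level.
if (length(filenames) == length(destnames)) {
  file.rename(filenames, destnames)
} else {
  message("something went wrong")
}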



But I would make copies of 2 or 3 files first and test with, say,

filenames[1:2] and destnames[1:2]
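
The trial run suggested above could be sketched as follows. This is only an illustration under stated assumptions: the two files are created in a temporary directory as stand-ins for file1.txt and file2.txt, readLines() is used here in place of scan(), and the extra safety check is an addition that is not part of the code above.

```r
# Scratch directory with two small files standing in for file1.txt, file2.txt.
tmp <- file.path(tempdir(), "first_line_demo")
unlink(tmp, recursive = TRUE)
dir.create(tmp)
writeLines(c("abc.txt", "some data"), file.path(tmp, "file1.txt"))
writeLines(c("bed.txt", "more data"), file.path(tmp, "file2.txt"))

filenames <- list.files(tmp, pattern = "^file\\d+\\.txt$", full.names = TRUE)
# First line of each file, trimmed, used as the new base name.
first_lines <- vapply(filenames, function(p) trimws(readLines(p, n = 1L)),
                      character(1), USE.NAMES = FALSE)
destnames <- file.path(tmp, first_lines)

# Refuse to rename if any target is empty, duplicated, or already present.
safe <- all(nzchar(first_lines)) && !anyDuplicated(destnames) &&
  !any(file.exists(destnames))
if (safe) {
  renamed <- file.rename(filenames, destnames)
} else {
  message("not renaming: check the target names first")
}
```

Checking for duplicated or existing targets first is one way to avoid the data loss mentioned earlier in the thread, since two files renamed to the same target would overwrite each other.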


Hope this helps,

Rui Barradas

At 18:51 on 16/09/2022, Kai Yang wrote:
> Hello, here is the example:
>
>   file name       first row
>   file1.txt       abc.txt
>   file2.txt       bed.txt
>   file3.txt       gogo.txt
>   . .
>   file1243.txt    last.txt
>
> I want to use a loop because I need to read the first row of the first
> file, rename that file, and then go on to the next file. I'm not sure if this
> is the right way to approach my goal. Any suggestions will be appreciated.
> Thanks,
> Kai
>
> On Friday, September 16, 2022 at 10:38:32 AM PDT, Rui Barradas wrote:
>  
>  Hello,
> 
> Please post the first row of 2 or 3 files and the expected result.
> 
> You can rename files with ?file.rename. This function is vectorized on its
> arguments so you do not need a loop, only the source and destination
> filenames. Both vectors should have the same length; if not, strange
> things will occur, including data loss.
> 
> Hope this helps,
> 
> Rui Barradas
> 
> At 18:26 on 16/09/2022, Kai Yang via R-help wrote:
>> Hello, I have a lot of files without meaningful names, such as file1.txt,
>> file2.txt, ... I need to rename them using the information from the first
>> row of each file. I can already read the first row of each
>> file; now I need to know how to rename the files in R (using a loop?).
>> Thank you for your help. Kai


[R] rename files in R

2022-09-16 Thread Kai Yang via R-help
Hello, I have a lot of files without meaningful names, such as file1.txt, 
file2.txt, ... I need to rename them using the information from the first row 
of each file. I can already read the first row of each file; 
now I need to know how to rename the files in R (using a loop?). Thank you for your 
help.
Kai




[R] function problem: multi selection in one argument

2022-01-24 Thread Kai Yang via R-help
Hello Team,
I can run the function below:

library(tidyverse)

f2 <- function(indata, subgrp1){
  indata0 <- indata
  temp    <- indata0 %>% select({{subgrp1}}) %>% arrange({{subgrp1}}) %>% 
    group_by({{subgrp1}}) %>%
    mutate(numbering =row_number(), max=max(numbering))
  view(temp)
  f_table <- table(temp$Species)
  view(f_table)
  return(f_table)
}
f2(iris, Species)

As you can see, I pass only Species as the second argument, and it works fine. 
But if I want the second argument to be both Petal.Width and Species, how should I 
write the argument? I tried f2(iris, c(Petal.Width, Species)), but I got this 
error message:
Error: arrange() failed at implicit mutate() step. 
* Problem with `mutate()` column `..1`.
i `..1 = c(Petal.Width, Species)`.
i `..1` must be size 150 or 1, not 300.

I'm not sure whether the problem should be fixed inside the function or in the way 
I call the function.
Thank you,
Kai
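
A minimal sketch of one way to accept several columns in a single argument, assuming dplyr 1.0.0 or later. The error arises because arrange(c(Petal.Width, Species)) concatenates the two columns into one vector of length 300; wrapping the embraced argument in across() treats it as a column selection instead. The names f2_multi and subgrps are illustrative and do not come from the original post.

```r
library(dplyr)

# f2_multi and its argument name `subgrps` are illustrative, not from the thread.
f2_multi <- function(indata, subgrps) {
  indata %>%
    select({{ subgrps }}) %>%
    arrange(across({{ subgrps }})) %>%    # sort by every selected column
    group_by(across({{ subgrps }})) %>%   # group by every selected column
    mutate(numbering = row_number(), max = max(numbering)) %>%
    ungroup()
}

one_col <- f2_multi(iris, Species)
two_col <- f2_multi(iris, c(Petal.Width, Species))

# Frequency table over the selected grouping columns instead of a fixed column.
f_table <- table(select(iris, c(Petal.Width, Species)))
```

The original function also hard-codes temp$Species in the call to table(), which would need the same generalisation; the last line shows one possible form.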


[R] how to find the table in R studio

2022-01-12 Thread Kai Yang via R-help
Hi all,
I created a function in R. It generates a table "temp". I can view it in 
RStudio, but I cannot find it in the top-right pane of RStudio. Can someone 
tell me how to find it there? The same goes for f_table. 
Thank you,
Kai
library(tidyverse)

f1 <- function(indata , subgrp1){
  subgrp1 <- enquo(subgrp1)
  indata0 <- indata
  temp    <- indata0 %>% select(!!subgrp1) %>% arrange(!!subgrp1) %>% 
    group_by(!!subgrp1) %>%
    mutate(numbering =row_number(), max=max(numbering))
  view(temp)
  f_table <- table(temp$Species)
  view(f_table)
}

f1(iris, Species)
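
A likely explanation, sketched under the assumption that the top-right pane is showing the global environment: objects created inside a function are local to that call and are discarded when it returns, so they never appear there. Returning them and assigning the result makes them visible. The helper below uses base R only, and make_tables is a hypothetical name.

```r
# make_tables is a hypothetical helper; it returns both objects
# instead of only viewing them inside the function.
make_tables <- function(indata, group_col) {
  temp <- indata[order(indata[[group_col]]), group_col, drop = FALSE]
  f_table <- table(temp[[group_col]])
  list(temp = temp, f_table = f_table)
}

res <- make_tables(iris, "Species")
temp <- res$temp        # now lives in the calling (global) environment
f_table <- res$f_table
```

After these assignments, temp and f_table are ordinary objects in the workspace and can be inspected like any other.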




Re: [R] question about error message: "Aesthetics must be either length 1 or the same as the data (226): y and colour"

2021-12-30 Thread Kai Yang via R-help
Thank you all for your help. I'm new to R. I'll follow Bert's and your 
suggestions and post such questions in another forum in future. 
Happy New Year
Kai
On Thursday, December 30, 2021, 02:05:12 PM PST, CALUM POLWART 
 wrote:  
 
You will get shot down in flames for posting this here. This list is for
base R and your query is tidyverse-related, so it belongs somewhere
tidyverse-specific...

But...

mpg[[y]]

returns the column mpg[[y]] unfiltered, whereas displ is filtered.
It would be possible to use mpg[[y]][mpg$hwy < 35],
but I suspect there is a better solution involving `.` as a data source, or
perhaps select() to pick only the data you want...
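
The length mismatch described above can be checked directly. This is a small sketch that loads ggplot2 only for its mpg data set; the variable names are illustrative.

```r
library(ggplot2)  # only for the mpg data set

filtered <- subset(mpg, hwy < 35)

n_full     <- length(mpg[["hwy"]])       # column taken from the unfiltered data
n_filtered <- nrow(filtered)             # rows the plot layer actually receives
n_fixed    <- length(filtered[["hwy"]])  # column taken from the filtered data
```

Here n_filtered corresponds to the 226 quoted in the error message, while n_full is the full row count of mpg, which is why the aesthetic cannot be matched to the data.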

On 30 Dec 2021 18:42, Kai Yang via R-help  wrote:


Hi R team,
I can create a plot using the code below:
library(ggplot2)
library(dplyr)
mpg %>%
  filter(hwy <35) %>% 
  ggplot(aes(x = displ, y = hwy, color = cyl)) + 
  geom_point()
ggsave("c:/temp/hwy_cyl.jpg",width = 9, height = 6, dpi = 1200, units = "in")


I want to do exactly the same work using a function. Below is the function I 
created:
plot1 <- function(y, c, f){
          mpg %>%
            filter(hwy <35) %>% 
            ggplot(aes(x = displ, y = mpg[[y]], color = mpg[[c]] )) + 
            geom_point()  
          ggsave(paste0("c:/temp/", f ,".jpg"), width = 9, height = 6, dpi = 
1200, units = "in")  
} 
plot1("hwy","cyl","hwy_cyl_f")

But I got an error message when I ran the code: "Aesthetics must be either length 
1 or the same as the data (226): y and colour". I looked up the message online. 
My understanding is that I need to add "fill" to the geom_point() call. 
My questions are:
1. Is it possible to make the code work without adding 'fill' to the geom_point() 
call, while keeping the colors the same as in the first code? 
2. If I must add the 'fill' option to the geom_point() call, how do I add it? 
Should I add 226 color names after 'fill'?
3. This is my first function in R, and I'm sure there are many problems in the 
code. Please point out my errors.


Thank you,
Kai



[R] question about error message: "Aesthetics must be either length 1 or the same as the data (226): y and colour"

2021-12-30 Thread Kai Yang via R-help
Hi R team,
I can create a plot using the code below:
library(ggplot2)
library(dplyr)
mpg %>%
  filter(hwy <35) %>% 
  ggplot(aes(x = displ, y = hwy, color = cyl)) + 
  geom_point()
ggsave("c:/temp/hwy_cyl.jpg",width = 9, height = 6, dpi = 1200, units = "in")


I want to do exactly the same work using a function. Below is the function I 
created:
plot1 <- function(y, c, f){
          mpg %>%
            filter(hwy <35) %>% 
            ggplot(aes(x = displ, y = mpg[[y]], color = mpg[[c]] )) + 
            geom_point()  
          ggsave(paste0("c:/temp/", f ,".jpg"), width = 9, height = 6, dpi = 
1200, units = "in")  
} 
plot1("hwy","cyl","hwy_cyl_f")

But I got an error message when I ran the code: "Aesthetics must be either length 
1 or the same as the data (226): y and colour". I looked up the message online. 
My understanding is that I need to add "fill" to the geom_point() call. 
My questions are:
1. Is it possible to make the code work without adding 'fill' to the geom_point() 
call, while keeping the colors the same as in the first code? 
2. If I must add the 'fill' option to the geom_point() call, how do I add it? 
Should I add 226 color names after 'fill'?
3. This is my first function in R, and I'm sure there are many problems in the 
code. Please point out my errors.


Thank you,
Kai
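
A minimal sketch of one way to write such a function without touching 'fill', assuming ggplot2 3.0.0 or later. The .data pronoun looks each column up by name inside the filtered data, so the aesthetics have the same length as the data. The function name plot_by_name, the out_dir argument and the lower dpi are illustrative choices rather than part of the original post.

```r
library(ggplot2)

# plot_by_name, out_dir and the low dpi are illustrative choices.
plot_by_name <- function(y, c, f, out_dir = tempdir()) {
  p <- ggplot(subset(mpg, hwy < 35),
              aes(x = displ, y = .data[[y]], color = .data[[c]])) +
    geom_point() +
    labs(y = y, color = c)   # readable axis and legend titles
  out_file <- file.path(out_dir, paste0(f, ".jpg"))
  # Pass the plot explicitly rather than relying on the last plot drawn.
  ggsave(out_file, plot = p, width = 9, height = 6, dpi = 150, units = "in")
  invisible(p)
}

p <- plot_by_name("hwy", "cyl", "hwy_cyl_f")
```

Replacing out_dir with "c:/temp" and dpi with 1200 would match the settings used in the question.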



Re: [R] question about for loop

2021-12-24 Thread Kai Yang via R-help
Thanks Avi. This is a good idea. I'm learning how to create a function in R 
now and may have more questions for you. I really appreciate all of your help. 
I wish you a great holiday and a happy new year!
Kai
On Friday, December 24, 2021, 04:54:36 PM PST, Avi Gross via R-help 
 wrote:  
 
 Suggestion to consider another approach:

Once various errors are fixed, the program being done basically sounds like
you want to repeat a sequence of actions one per ROW of a data.frame. As it
happens, the action to perform is based on a second data.frame given to
ggplot, with parts dynamically inserted from the variables in one row.

So consider making a function with a name like graph_it() that does what you
want when passed three named arguments. I mean it takes arguments with names
like alpha, beta and gamma. Then use the pmap() function (unfortunately part
of the tidyverse, in the purrr package) along with the function you
want:

Typing:

pmap(.l=mac2, .f=graph_it)

Will implicitly perform your functionality one row after another without an
explicit loop and often faster than nested loops would be. I have used the
technique to replace a deeply nested loop that generates all combinations of
multiple categorical variables (about a million) into a data.frame, then do
something with each one fairly efficiently.

If nothing else, it would make this problem a tad simpler by not needing
subscripts.
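
The row-wise approach described above might look like the sketch below, assuming purrr and ggplot2 are installed. pmap() matches data frame columns to function arguments by name, so the arguments here are called y, c and f to match mac2. The body of graph_it, the inline mac2, the temporary output directory and the low dpi are all assumptions made for illustration.

```r
library(ggplot2)
library(purrr)

# Inline stand-in for the parameter file read into mac2.
mac2 <- data.frame(
  y = c("hwy", "cty"),
  c = c("cyl", "class"),
  f = c("hwy_cyl2", "cty_class2"),
  stringsAsFactors = FALSE
)

# Argument names must match the column names of mac2 for pmap() to line them up.
graph_it <- function(y, c, f, out_dir = tempdir()) {
  p <- ggplot(subset(mpg, hwy < 35),
              aes(x = displ, y = .data[[y]], color = .data[[c]])) +
    geom_point() +
    labs(y = y, color = c)
  out_file <- file.path(out_dir, paste0(f, ".jpg"))
  ggsave(out_file, plot = p, width = 9, height = 6, dpi = 150)
  out_file
}

# One call per row; pmap_chr() collects the returned file paths.
paths <- pmap_chr(mac2, graph_it)
```

pmap_chr() is used instead of pmap() only so that the result is a plain character vector of paths.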

-Original Message-
From: R-help  On Behalf Of Andrew Simmons
Sent: Friday, December 24, 2021 5:37 PM
To: Kai Yang 
Cc: R-help Mailing List 
Subject: Re: [R] question about for loop

y, c, and f only exist in the context of mac2. If you want to use them,
you'll have to write mac2$y, mac2$c, or mac2$f (or the [[ versions
mac2[["y"]], mac2[["c"]], or mac2[["f"]]). Combining that with index i would
then look like mac2$y[[i]] or mac2[[i, "y"]].
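
The indexing forms mentioned above can be tried on a small stand-in for mac2, built inline here rather than read from the CSV file:

```r
mac2 <- data.frame(
  y = c("hwy", "cty"),
  c = c("cyl", "class"),
  f = c("hwy_cyl2", "cty_class2"),
  stringsAsFactors = FALSE
)

i <- 2
via_dollar <- mac2$y[[i]]     # column first, then element i
via_matrix <- mac2[[i, "y"]]  # row i, column "y" in one step

# with() makes the columns of one row visible as plain names.
via_with <- with(mac2[i, ], paste(y, c, f))
```

All three forms read values from row i of the data frame; a bare y would only work if a separate variable named y existed in the workspace.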

Also, I think you want to use aes_string instead of aes (since you want
those expressions within aes to be evaluated). Something like this seems to
work for me:


`%>%` <- magrittr::`%>%`


writeLines(FILE <- tempfile(), text =
r"{y,c,f
hwy,cyl,hwy_cyl2
cty,class,cty_class2}")


mac2 <- readr::read_csv(FILE)
for (i in seq_len(nrow(mac2))) {
    ggplt <- ggplot2::mpg %>%
        dplyr::filter(hwy < 35) %>%
        ggplot2::ggplot(
            ggplot2::aes_string(
                x = "displ",
                y = mac2[[i, "y"]],
                color = mac2[[i, "c"]]
            )
        ) +
        ggplot2::geom_point() +
        ggplot2::ylab(mac2[[i, "y"]]) +
        ggplot2::guides(
            color = ggplot2::guide_legend(title = mac2[[i, "c"]])
        )
    ggplot2::ggsave(
        filename = tempfile(
            mac2[[i, "f"]],
            fileext = ".jpg"
        ),
        plot = ggplt,
        width = 9, height = 6, dpi = 1200
    )
}


unlink(FILE)


runs fine on my computer, but might look more like this for you:


library(magrittr)
library(ggplot2)
library(dplyr)
library(readr)


mac2 <- read_csv("C:/temp/mac2.csv")
for (i in seq_len(nrow(mac2))) {
    ggplt <- mpg %>%
        filter(hwy < 35) %>%
        ggplot(
            aes_string(
                x = "displ",
                y = mac2[[i, "y"]],
                color = mac2[[i, "c"]]
            )
        ) +
        geom_point() +
        ylab(mac2[[i, "y"]]) +
        guides(
            color = guide_legend(title = mac2[[i, "c"]])
        )
    ggsave(
        filename = paste0("C:/temp/", mac2[[i, "f"]], ".jpg"),
        plot = ggplt,
        width = 9, height = 6, dpi = 1200
    )
}


try reading through aes and aes_string, and keep in mind that columns in
data frames aren't R variables (where they are in Excel). If you want to use
columns like they are variables, you can try using `with`. For example:


library(magrittr)
library(ggplot2)
library(dplyr)
library(readr)


mac2 <- read_csv("C:/temp/mac2.csv")
for (i in seq_len(nrow(mac2))) {
    with(mac2[i, ], {
        ggplt <- mpg %>%
            filter(hwy < 35) %>%
            ggplot(
                aes_string(
                    x = "displ",
                    y = y,
                    color = c
                )
            ) +
            geom_point() +
            ylab(y) +
            guides(
                color = guide_legend(title = c)
            )
        ggsave(
            filename = paste0("C:/temp/", f, ".jpg"),
            plot = ggplt,
            width = 9, height = 6, dpi = 1200
        )
    })
}




On Fri, Dec 24, 2021 at 4:48 PM Kai Yang via R-help 
wrote:

> Hello Team,
> I created a CSV file (mac2) to store parameter values. The file looks like:
>
> y,c,f
> hwy,cyl,hwy_cyl2
> cty,class,cty_class2
>

Re: [R] question about for loop

2021-12-24 Thread Kai Yang via R-help
 Thanks Andrew. This is super helpful. --- Kai
On Friday, December 24, 2021, 02:37:14 PM PST, Andrew Simmons 
 wrote:  
 
y, c, and f only exist in the context of mac2. If you want to use them, you'll 
have to write mac2$y, mac2$c, or mac2$f (or the [[ versions mac2[["y"]], 
mac2[["c"]], or mac2[["f"]]). Combining that with index i would then look like 
mac2$y[[i]] or mac2[[i, "y"]].
Also, I think you want to use aes_string instead of aes (since you want those 
expressions within aes to be evaluated). Something like this seems to work for me:

`%>%` <- magrittr::`%>%`


writeLines(FILE <- tempfile(), text =
r"{y,c,f
hwy,cyl,hwy_cyl2
cty,class,cty_class2}")


mac2 <- readr::read_csv(FILE)
for (i in seq_len(nrow(mac2))) {
    ggplt <- ggplot2::mpg %>%
        dplyr::filter(hwy < 35) %>%
        ggplot2::ggplot(
            ggplot2::aes_string(
                x = "displ",
                y = mac2[[i, "y"]],
                color = mac2[[i, "c"]]
            )
        ) +
        ggplot2::geom_point() +
        ggplot2::ylab(mac2[[i, "y"]]) +
        ggplot2::guides(
            color = ggplot2::guide_legend(title = mac2[[i, "c"]])
        )
    ggplot2::ggsave(
        filename = tempfile(
            mac2[[i, "f"]],
            fileext = ".jpg"
        ),
        plot = ggplt,
        width = 9, height = 6, dpi = 1200
    )
}


unlink(FILE)


runs fine on my computer, but might look more like this for you:

library(magrittr)
library(ggplot2)
library(dplyr)
library(readr)


mac2 <- read_csv("C:/temp/mac2.csv")
for (i in seq_len(nrow(mac2))) {
    ggplt <- mpg %>%
        filter(hwy < 35) %>%
        ggplot(
            aes_string(
                x = "displ",
                y = mac2[[i, "y"]],
                color = mac2[[i, "c"]]
            )
        ) +
        geom_point() +
        ylab(mac2[[i, "y"]]) +
        guides(
            color = guide_legend(title = mac2[[i, "c"]])
        )
    ggsave(
        filename = paste0("C:/temp/", mac2[[i, "f"]], ".jpg"),
        plot = ggplt,
        width = 9, height = 6, dpi = 1200
    )
}


try reading through aes and aes_string, and keep in mind that columns in data 
frames aren't R variables (where they are in Excel). If you want to use columns 
like they are variables, you can try using `with`. For example:

library(magrittr)
library(ggplot2)
library(dplyr)
library(readr)


mac2 <- read_csv("C:/temp/mac2.csv")
for (i in seq_len(nrow(mac2))) {
    with(mac2[i, ], {
        ggplt <- mpg %>%
            filter(hwy < 35) %>%
            ggplot(
                aes_string(
                    x = "displ",
                    y = y,
                    color = c
                )
            ) +
            geom_point() +
            ylab(y) +
            guides(
                color = guide_legend(title = c)
            )
        ggsave(
            filename = paste0("C:/temp/", f, ".jpg"),
            plot = ggplt,
            width = 9, height = 6, dpi = 1200
        )
    })
}




On Fri, Dec 24, 2021 at 4:48 PM Kai Yang via R-help  
wrote:

Hello Team,
I created a CSV file (mac2) to store parameter values. The file looks like:

y,c,f
hwy,cyl,hwy_cyl2
cty,class,cty_class2

Then I load the file into R and use the parameters y, c, f in a for loop; see 
my code below:
library(ggplot2)
library(tidyverse)
library(readr)
mac2 <- read_csv("C:/temp/mac2.csv")
View(mac2)
for (i in seq(nrow(mac2))){
  mpg %>%
    filter(hwy <35) %>% 
    ggplot(aes(x = displ, y = get(y[i]), color = get(c[i]) )) + 
    geom_point()+
    ylab(y[i]) +                              
    guides(color = guide_legend(title = c[i]))    
ggsave(paste0("c:/temp/",f[i],".jpg"),width = 9, height = 6, dpi = 1200, units 
= "in")
}

but I got an error message: "Error in dots_list(..., title = title, subtitle = 
subtitle, caption = caption,  :  object 'y' not found"
Does anyone know how to fix the problem?
Thanks,
Kai




[R] question about for loop

2021-12-24 Thread Kai Yang via R-help
Hello Team,
I created a CSV file (mac2) to store parameter values. The file looks like:

y,c,f
hwy,cyl,hwy_cyl2
cty,class,cty_class2

Then I load the file into R and use the parameters y, c, f in a for loop; see 
my code below:
library(ggplot2)
library(tidyverse)
library(readr)
mac2 <- read_csv("C:/temp/mac2.csv")
View(mac2)
for (i in seq(nrow(mac2))){
  mpg %>%
    filter(hwy <35) %>% 
    ggplot(aes(x = displ, y = get(y[i]), color = get(c[i]) )) + 
    geom_point()+
    ylab(y[i]) +                              
    guides(color = guide_legend(title = c[i]))    
ggsave(paste0("c:/temp/",f[i],".jpg"),width = 9, height = 6, dpi = 1200, units 
= "in")
}

but I got an error message: "Error in dots_list(..., title = title, subtitle = 
subtitle, caption = caption,  :  object 'y' not found"
Does anyone know how to fix the problem?
Thanks,
Kai




Re: [R] for loop question in R

2021-12-22 Thread Kai Yang via R-help
Hi Rui and Ivan, thank you for explaining the code to me in detail. This is very 
helpful, and the code works well now. Happy holidays,
Kai
On Wednesday, December 22, 2021, 02:30:49 PM PST, Rui Barradas 
 wrote:  
 
 Hello,

y[i] and c[i] are character strings, they are not variables of data set mpg.
To get the variables, use, well, help("get").
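
The difference between a column name held in a string and the column itself can be seen in a small base R sketch. The data frame df and the variable col_name are made up for illustration.

```r
df <- data.frame(hwy = c(29, 31), cty = c(18, 21))
col_name <- "hwy"

as_string <- with(df, col_name)       # just the text "hwy"
as_column <- with(df, get(col_name))  # the values of df$hwy
by_index  <- df[[col_name]]           # same values, without get()
```

get() looks the name up in the evaluation environment, which inside with() (or aes()) includes the columns of the data.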

Note that I have changed the temp dir to mine. So I created a variable 
to hold the value


tmpdir <- "c:/temp/"

for (i in seq(nrow(mac))){
  mpg %>%
    filter(hwy < 35) %>%
    ggplot(aes(x = displ, y = get(y[i]), color = get(c[i]))) +
    geom_point() +
    ylab(y[i]) +
    guides(color = guide_legend(title = c[i]))
  ggsave(
    paste0(tmpdir, f[i], ".jpg"),
    width = 9,
    height = 6,
    dpi = 1200,
    units = "in")
}



Like Ivan said, don't rely on auto-printing. In order not to have to open the 
graphics files output by the loop, I would have done something like the 
following.

First create a list to hold the plots. Inside the for loop save the 
plots in the list and explicitly print them. And use ggsave argument 
plot. Like this, after the loop you can see what you have by printing 
each list member.


p <- vector("list", length = nrow(mac))
for (i in seq(nrow(mac))){
  mpg %>%
    filter(hwy < 35) %>%
    ggplot(aes(x = displ, y = get(y[i]), color = get(c[i]))) +
    geom_point() +
    ylab(y[i]) +
    guides(color = guide_legend(title = c[i])) -> p[[i]]
  ggsave(
    paste0(tmpdir, f[i], ".jpg"),
    plot = p[[i]],
    width = 9,
    height = 6,
    dpi = 1200,
    units = "in")
}

# See the first plot
p[[1]]


Hope this helps,

Rui Barradas

At 18:18 on 22/12/21, Kai Yang via R-help wrote:
> Hello Eric, Jim and Ivan,
> Many thanks for all of your help. I'm new to R, so I may not fully
> understand your ideas. I modified my code (below). I now get the plots
> with the correct file names, but the plots do not use the correct field names:
> they use y[i] and c[i] as variable names instead of hwy, cyl or cty, class in
> the ggplot statement, and there is no error message. Could you please look
> at my modified code below and let me know how to modify the y = y[i],
> color = c[i] part?
> Thanks,
> Kai
> 
> y <- c("hwy","cty")
> c <- c("cyl","class")
> f <- c("hwy_cyl","cty_class")
> mac <- data.frame(y,c,f)
> for (i in seq(nrow(mac))){
>    mpg %>%
>      filter(hwy <35) %>%
>      ggplot(aes(x = displ, y = y[i], color = c[i])) +
>      geom_point()
>    ggsave(paste0("c:/temp/",f[i],".jpg"),width = 9, height = 6, dpi = 1200, 
>units = "in")
> }
> 
>      On Wednesday, December 22, 2021, 09:42:45 AM PST, Ivan Krylov 
> wrote:
>  
>  On Wed, 22 Dec 2021 16:58:18 + (UTC)
> Kai Yang via R-help  wrote:
> 
>> mpg %>%    filter(hwy <35) %>%     ggplot(aes(x = displ, y = y[i],
>> color = c[i])) +     geom_point()
> 
> Your code relies on R's auto-printing, where each line of code executed
> at the top level (not in loops or functions) is run as if it was
> wrapped in print(...the rest of the line...).
> 
> Solution: make that print() explicit.
> 
> A better solution: explicitly pass the plot object returned by the
> ggplot functions to the ggsave() function instead of relying on the
> global state of the program.
> 
>> ggsave("c:/temp/f[i].jpg",width = 9, height = 6, dpi = 1200, units =
>> "in")
> 
> When you type "c:/temp/f[i].jpg", what do you get in return?
> 
> Use paste0() or sprintf() to compose strings out of parts.
> 
>>      [[alternative HTML version deleted]]
> 
> P.S. Please compose your messages in plain text, not HTML. See the
> R-help posting guide for more info.
> 
  


Re: [R] for loop question in R

2021-12-22 Thread Kai Yang via R-help
Strange, I got an error message when I ran it again:
Error: unexpected symbol in:
"    geom_point()
  ggsave"
> }
Error: unexpected '}' in "}"
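
One way to get this pair of messages, offered only as a guess because the code quoted in the thread is balanced: if a parenthesis opened earlier is never closed, the newline before ggsave no longer ends the statement, and the parser sees two symbols in a row. The strings below are made-up fragments, and parse() only checks syntax without running anything.

```r
# The call opened by `plot(` is never closed, so the line break before
# `ggsave` is ignored and the parser meets an unexpected symbol.
unbalanced <- "plot(aes(x = 1) +\n  geom_point()\nggsave('a.jpg')"
msg <- tryCatch({
  parse(text = unbalanced)
  "parsed"
}, error = function(e) conditionMessage(e))

# With the parenthesis closed, the same text parses as two statements.
balanced <- "plot(aes(x = 1)) +\n  geom_point()\nggsave('a.jpg')"
exprs <- parse(text = balanced)
```

Re-running the whole block from a clean prompt, rather than continuing a half-entered expression, is usually enough to rule this out.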

On Wednesday, December 22, 2021, 10:18:56 AM PST, Kai Yang 
 wrote:  
 
Hello Eric, Jim and Ivan,
Many thanks for all of your help. I'm new to R, so I may not fully 
understand your ideas. I modified my code (below). I now get the plots 
with the correct file names, but the plots do not use the correct field names: 
they use y[i] and c[i] as variable names instead of hwy, cyl or cty, class in 
the ggplot statement, and there is no error message. Could you please look 
at my modified code below and let me know how to modify the y = y[i], 
color = c[i] part?
Thanks,
Kai

y <- c("hwy","cty")
c <- c("cyl","class")
f <- c("hwy_cyl","cty_class")
mac <- data.frame(y,c,f)
for (i in seq(nrow(mac))){
  mpg %>%
    filter(hwy <35) %>% 
    ggplot(aes(x = displ, y = y[i], color = c[i])) + 
    geom_point()
  ggsave(paste0("c:/temp/",f[i],".jpg"),width = 9, height = 6, dpi = 1200, 
units = "in")
}

    On Wednesday, December 22, 2021, 09:42:45 AM PST, Ivan Krylov 
 wrote:  
 
 On Wed, 22 Dec 2021 16:58:18 + (UTC)
Kai Yang via R-help  wrote:

> mpg %>%    filter(hwy <35) %>%     ggplot(aes(x = displ, y = y[i],
> color = c[i])) +     geom_point()

Your code relies on R's auto-printing, where each line of code executed
at the top level (not in loops or functions) is run as if it was
wrapped in print(...the rest of the line...).

Solution: make that print() explicit.

A better solution: explicitly pass the plot object returned by the
ggplot functions to the ggsave() function instead of relying on the
global state of the program.

> ggsave("c:/temp/f[i].jpg",width = 9, height = 6, dpi = 1200, units =
> "in")

When you type "c:/temp/f[i].jpg", what do you get in return?

Use paste0() or sprintf() to compose strings out of parts.

>     [[alternative HTML version deleted]]

P.S. Please compose your messages in plain text, not HTML. See the
R-help posting guide for more info.

-- 
Best regards,
Ivan



Re: [R] for loop question in R

2021-12-22 Thread Kai Yang via R-help
Hello Eric, Jim and Ivan,
Many thanks for all of your help. I'm new to R, so I may not fully 
understand your ideas. I modified my code (below). I now get the plots 
with the correct file names, but the plots do not use the correct field names: 
they use y[i] and c[i] as variable names instead of hwy, cyl or cty, class in 
the ggplot statement, and there is no error message. Could you please look 
at my modified code below and let me know how to modify the y = y[i], 
color = c[i] part?
Thanks,
Kai

y <- c("hwy","cty")
c <- c("cyl","class")
f <- c("hwy_cyl","cty_class")
mac <- data.frame(y,c,f)
for (i in seq(nrow(mac))){
  mpg %>%
    filter(hwy <35) %>% 
    ggplot(aes(x = displ, y = y[i], color = c[i])) + 
    geom_point()
  ggsave(paste0("c:/temp/",f[i],".jpg"),width = 9, height = 6, dpi = 1200, 
units = "in")
}

On Wednesday, December 22, 2021, 09:42:45 AM PST, Ivan Krylov 
 wrote:  
 
 On Wed, 22 Dec 2021 16:58:18 + (UTC)
Kai Yang via R-help  wrote:

> mpg %>%    filter(hwy <35) %>%     ggplot(aes(x = displ, y = y[i],
> color = c[i])) +     geom_point()

Your code relies on R's auto-printing, where each line of code executed
at the top level (not in loops or functions) is run as if it was
wrapped in print(...the rest of the line...).

Solution: make that print() explicit.

A better solution: explicitly pass the plot object returned by the
ggplot functions to the ggsave() function instead of relying on the
global state of the program.
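
The auto-printing behaviour described above can be observed in base R with a short sketch; capture.output() is used here only to record what would appear on the console.

```r
# A bare value inside a loop is evaluated but not printed.
implicit <- capture.output(for (i in 1:2) i)

# An explicit print() call produces output on every iteration.
explicit <- capture.output(for (i in 1:2) print(i))
```

The same applies to plot objects: inside a loop or function they are only drawn when print() is called on them, or when they are passed explicitly to a function such as ggsave().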

> ggsave("c:/temp/f[i].jpg",width = 9, height = 6, dpi = 1200, units =
> "in")

When you type "c:/temp/f[i].jpg", what do you get in return?

Use paste0() or sprintf() to compose strings out of parts.
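
A short sketch of the difference, using the f vector from the thread: text inside quotes is never evaluated, so the index has to be spliced in with a function call.

```r
f <- c("hwy_cyl", "cty_class")
i <- 1

literal     <- "c:/temp/f[i].jpg"                 # stays exactly as typed
with_paste  <- paste0("c:/temp/", f[i], ".jpg")   # pieces glued together
with_format <- sprintf("c:/temp/%s.jpg", f[i])    # %s replaced by f[i]
```

file.path("c:/temp", paste0(f[i], ".jpg")) is a further option that inserts the path separator itself.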

>     [[alternative HTML version deleted]]

P.S. Please compose your messages in plain text, not HTML. See the
R-help posting guide for more info.

-- 
Best regards,
Ivan
  


[R] for loop question in R

2021-12-22 Thread Kai Yang via R-help
Hello R team,
I want to use a for loop to generate multiple plots with 3 parameters (y is
for the y axis, c is for the color and f is for the output file name). I
created a data frame to hold this information and use it in the for loop. I
use y[i], c[i] and f[i] in the loop, but it doesn't seem to work. Can anyone
correct my code to make it work?
Thanks,
Kai

library(ggplot2)
library(tidyverse)
y <- c("hwy","cty")
c <- c("cyl","class")
f <- c("hwy_cyl","cty_class")
mac <- data.frame(y,c,f)
for (i in nrow(mac)){
  mpg %>%
    filter(hwy <35) %>%
    ggplot(aes(x = displ, y = y[i], color = c[i])) +
    geom_point()
  ggsave("c:/temp/f[i].jpg",width = 9, height = 6, dpi = 1200, units = "in")
}



[R] problem: try to passing macro value into submit block

2021-12-21 Thread Kai Yang via R-help
Hi team,
I'm trying to pass a macro variable into an R script in PROC IML. I want to
change the variable used in color= and export each result under a different
file name. Without the macro, the code works well. But when I try to use the
macro below, I get the error message: "Submit block cannot be directly placed
in a macro. Instead, place the submit block into a file first and then use
%include to include the file within a macro definition.". After reading the
message, I am still not sure how to fix the problem in the code. Can anyone
help me?
Thank you,
Kai

%macro pplot(a);
proc iml;
submit / R;
library(ggplot2)
library(tidyverse)
mpg %>%
  filter(hwy <35) %>%
  ggplot(aes(x = displ, y = hwy, color = )) +
  geom_point()
ggsave("c:/temp/")
endsubmit;
quit;
%mend;
%pplot(drv);
%pplot(cyl);



Re: [R] subset data frame problem

2021-12-13 Thread Kai Yang via R-help
 Thanks Richard, it works well. --- Kai
On Monday, December 13, 2021, 04:00:33 AM PST, Richard O'Keefe 
 wrote:  
 
 You want to DELETE rows satisfying the condition P & Q. The subset() function
requires an expression saying what you want to RETAIN, so you need
subset(PD, !(P & Q)).

test <- subset(PD, !(Class == "1st" & Survived == "No"))

By de Morgan's laws, !(P & Q) is the same as (!P) | (!Q), so you could also write

test <- subset(PD, Class != "1st" | Survived != "No")

I'd actually be tempted to do this in two steps:

unwanted <- PD$Class == "1st" & PD$Survived == "No"
test <- PD[!unwanted,]
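
The three variants above can be checked against each other on a toy data frame. This is a sketch only: the four-row PD below is hypothetical stand-in data, not the poster's table. Note that && is meant for single TRUE/FALSE values; older versions of R silently used only the first element of each side, and R 4.3.0 and later raise an error for longer vectors, so the element-wise & is the operator to use inside subset().

```r
# Toy stand-in for PD (hypothetical data)
PD <- data.frame(
  Class    = c("1st", "1st", "2nd", "2nd"),
  Survived = c("No",  "Yes", "No",  "Yes")
)

# Rows to drop: both conditions true at once (element-wise &)
unwanted <- PD$Class == "1st" & PD$Survived == "No"

keep_not <- subset(PD, !(Class == "1st" & Survived == "No"))
keep_or  <- subset(PD, Class != "1st" | Survived != "No")
keep_idx <- PD[!unwanted, ]

# All three approaches keep the same three rows
print(keep_not)
```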



On Mon, 13 Dec 2021 at 17:30, Kai Yang via R-help  wrote:

Hi R team,
I want to delete records from a data frame if Class = '1st' and
Survived = 'No'. I wrote the code below:
test <- subset(PD, Class != '1st' && Survived != 'No')
but the code returns a wrong result. Can someone help me with this?
Thanks,
Kai


[R] subset data frame problem

2021-12-12 Thread Kai Yang via R-help
Hi R team,
I want to delete records from a data frame if Class = '1st' and
Survived = 'No'. I wrote the code below:
test <- subset(PD, Class != '1st' && Survived != 'No')
but the code returns a wrong result. Can someone help me with this?
Thanks,
Kai


[R] GTsummary add title with multi row

2021-12-06 Thread Kai Yang via R-help
Hi R team,
When I use gtsummary to create a summary table, I can add a table title with
the modify_caption() function, but the title has only one row. Is it possible
to add a multi-row title to one summary table?
Thanks,
Kai


[R] SOMAscan data analysis

2021-12-02 Thread Kai Yang via R-help
Hello R team,
We have a huge SOMAscan data set. This is an aptamer-based proteomics assay
capable of measuring 1305 human protein analytes. Does anyone know which
package can load the data and do the analysis? I would appreciate any
suggestions, experience, web pages or papers.
Thank you,
Kai


[R] ggplot2 bar chart: order display for each group

2021-09-20 Thread Kai Yang via R-help
Hello List,

I submitted the code below, it will show two groups of avg_time bar chart for 
each gc_label.

ggplot(s8_GCtime, aes(fill=GTresult, y=avg_time, x=gc_label, label = avg_time)) 
+ 
  geom_bar(position=position_dodge(), stat="identity") +
  geom_text(aes(label=avg_time), vjust=1.6, position = position_dodge(0.9), 
size=3.5)+
  theme(axis.text.x = element_text(angle = 45))


I found that ggplot puts the smaller avg_time values on the left side and the
larger avg_time values on the right side for each gc_label. But I would like
to control the order by GTresult. Could you tell me how to do this?

Thanks,
Kai



[R] order stacked bar plot from total frequency

2021-09-16 Thread Kai Yang via R-help
Hello List,
I can order the general bar chart based on frequency, using ggplot2 with the 
code : aes((reorder(Var1, Freq)), Freq)) 
I don't know how to order a stacked bar plot by total frequency.
Can you give me an example?
Thank you
Kai
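
One way this might be approached: reorder() accepts a FUN argument, so with FUN = sum the factor levels are ordered by the total of Freq within each category instead of the default mean. The sketch below uses hypothetical data (the data frame d and the column Group are assumptions, not taken from the post) and base R only; the ggplot2 call is shown in comments.

```r
# Hypothetical long-format counts: one row per bar segment
d <- data.frame(
  Var1  = c("a", "a", "b", "b", "c", "c"),
  Group = c("x", "y", "x", "y", "x", "y"),
  Freq  = c(5, 1, 2, 2, 10, 3)
)

# reorder() summarises Freq within each Var1; FUN = sum gives the bar total
d$Var1 <- reorder(d$Var1, d$Freq, FUN = sum)
print(levels(d$Var1))   # levels sorted by total: b (4), a (6), c (13)

# With ggplot2 installed, the stacked bars then follow that order:
# library(ggplot2)
# ggplot(d, aes(x = Var1, y = Freq, fill = Group)) +
#   geom_bar(stat = "identity", position = "stack")
```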



Re: [R] ODP: ggplot question

2021-09-16 Thread Kai Yang via R-help
 Hi Grzegorz,
this is great! it works for me.
Thank you,
Kai
On Wednesday, September 15, 2021, 11:09:20 PM PDT, Grzegorz Smoliński 
 wrote:  
 
 Hi,

of course you can. This should work:

ggplot(s8_plot, aes(fill=GTresult, y=cases, x=gc_label)) +
geom_bar(position="stack", stat="identity") +
theme(axis.text.x = element_text(angle = 90, hjust = 1))

By "hjust" you make sure that labels do not overlap on plot.

Best regards,

Grzegorz

Wed, 15 Sep 2021 at 21:03 Kai Yang  wrote:
>
> Hi Grzegorz,
>
> You are correct. it works now.
>
> One more question: can I turn gc_label 90 degree in plot?
>
> Thank you
>
> Kai
>
> On Wednesday, September 15, 2021, 10:54:52 AM PDT, Grzegorz Smoliński 
>  wrote:
>
>
> Hi,
>
> Isn’t a bracket missing after gc_label?
>
> So it should be:
>
> > ggplot(s8_plot, aes(fill=GTresult, y=cases, x=gc_label)) +
>
> +  geom_bar(position="stack", stat="identity")
>
> Best,
>
> Grzegorz
>
> From: Kai Yang via R-help
> Sent: Wednesday, 15 September 2021 19:50
> To: R-help Mailing List
> Subject: [R] ggplot question
>
>
>
> Hello List,
>
> I use ggplot to draw a stacked bar chart, but I get an error message. Please
> look at it below:
>
>
>
> > ggplot(s8_plot, aes(fill=GTresult, y=cases, x=gc_label) +
>
> +  geom_bar(position="stack", stat="identity"))
>
>
>
> Error: Mapping should be created with `aes()` or `aes_()`.
>
>
>
> GTresult and gc_label are character variables, cases is numeric
> variable. How to fix the problem?
>
> Thank you
>
> Kai
>
>
>


Re: [R] ODP: ggplot question

2021-09-15 Thread Kai Yang via R-help
 Hi Grzegorz,
You are correct. it works now.
One more question: can I turn gc_label 90 degree in plot?
Thank you
Kai
On Wednesday, September 15, 2021, 10:54:52 AM PDT, Grzegorz Smoliński 
 wrote:  
 
 Hi,

Isn’t a bracket missing after gc_label?

So it should be:

> ggplot(s8_plot, aes(fill=GTresult, y=cases, x=gc_label)) +

+  geom_bar(position="stack", stat="identity")

Best,

Grzegorz

From: Kai Yang via R-help
Sent: Wednesday, 15 September 2021 19:50
To: R-help Mailing List
Subject: [R] ggplot question



Hello List,

I use ggplot to draw a stacked bar chart, but I get an error message. Please
look at it below:



> ggplot(s8_plot, aes(fill=GTresult, y=cases, x=gc_label) +

+  geom_bar(position="stack", stat="identity"))



Error: Mapping should be created with `aes()` or `aes_()`.



GTresult and gc_label are character variables, cases is numeric
variable. How to fix the problem?

Thank you

Kai





[R] ggplot question

2021-09-15 Thread Kai Yang via R-help
Hello List,
I use ggplot to draw a stacked bar chart, but I get an error message. Please
look at it below:

> ggplot(s8_plot, aes(fill=GTresult, y=cases, x=gc_label) + 
+   geom_bar(position="stack", stat="identity"))

Error: Mapping should be created with `aes()` or `aes_()`.

GTresult and gc_label are character variables, cases is numeric variable. How 
to fix the problem?
Thank you
Kai



[R] dbWriteTable does not append rows in database

2021-09-13 Thread Kai Yang via R-help
Hi List,
I want to append some rows from R to a table in SQL Server, so I submitted the
code below. There is no error message:
dbWriteTable(conn = con, name = "PMDB._Alias_A", value = try1, overwrite=FALSE, 
append=TRUE, row.names = FALSE)
But when I query the data from SQL Server, I cannot find the appended
records. Did I miss something in the code?
Thank you,
Kai



Re: [R] how to find "first" or "last" record after sort in R

2021-09-10 Thread Kai Yang via R-help
 Thanks Therneau, duplicated() function works well. --- Kai
On Friday, September 10, 2021, 05:13:47 AM PDT, Therneau, Terry M., Ph.D. 
 wrote:  
 
 I prefer the duplicated() function, since the final code will be clear to a 
future reader. 
  (Particularly when I am that future reader).

last <- !duplicated(mydata$ID, fromLast=TRUE)  # point to the last ID for each subject
mydata$date3[last] <- NA
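
As a sketch of how this behaves on the sample rows from the question (the data frame is retyped here by hand, with date2 left out for brevity, and the target column is date3 as in the posted table), the duplicated() call with fromLast = TRUE can be run like this:

```r
mydata <- data.frame(
  ID    = c(1, 2, 3, 3, 4, 5),
  date1 = as.Date(c("2015-10-08", "2016-01-16", "2016-08-01",
                    "2017-01-10", "2016-01-19", "2016-03-01")),
  date3 = as.Date(c("2015-07-23", "2015-10-08", "2017-01-10",
                    "2016-01-16", "2016-08-01", "2016-01-19"))
)

# fromLast = TRUE scans from the bottom, so the last row of each ID is the
# first one seen and is therefore NOT marked as a duplicate
last <- !duplicated(mydata$ID, fromLast = TRUE)
mydata$date3[last] <- NA
print(mydata)
```

Only the first of the two ID 3 rows keeps its date3 value; every other row is the last (and only) row for its ID. This relies on the data already being sorted by ID and date1, as stated in the question.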

Terry T.

(I read the list once a day in digest form, so am always a late reply.)

On 9/10/21 5:00 AM, r-help-requ...@r-project.org wrote:
> Hello List,
> Please look at the sample data frame below:
> 
> ID  date1       date2       date3
> 1   2015-10-08  2015-12-17  2015-07-23
> 2   2016-01-16  NA          2015-10-08
> 3   2016-08-01  NA          2017-01-10
> 3   2017-01-10  NA          2016-01-16
> 4   2016-01-19  2016-02-24  2016-08-01
> 5   2016-03-01  2016-03-10  2016-01-19
> 
> This data frame was sorted by ID and date1. I need to set the column date3 to
> missing for the "last" record of each ID. In the sample data set, IDs 1, 2, 4
> and 5 have one row only, so each of those rows can be considered both the
> first and the last record, and date3 can be set to missing. But ID 3 has 2
> rows. Since I sorted the data by ID and date1, the row with ID=3 and
> date1=2017-01-10 is the last record, so I need to set date3=NA for this row
> only.
> 
> The question is: how can I identify the "last" record and set it to NA in the
> date3 column?
> Thank you,
> Kai


[R] how to find "first" or "last" record after sort in R

2021-09-09 Thread Kai Yang via R-help
Hello List,
Please look at the sample data frame below:

ID  date1       date2       date3
1   2015-10-08  2015-12-17  2015-07-23
2   2016-01-16  NA          2015-10-08
3   2016-08-01  NA          2017-01-10
3   2017-01-10  NA          2016-01-16
4   2016-01-19  2016-02-24  2016-08-01
5   2016-03-01  2016-03-10  2016-01-19

This data frame was sorted by ID and date1. I need to set the column date3 to
missing for the "last" record of each ID. In the sample data set, IDs 1, 2, 4
and 5 have one row only, so each of those rows can be considered both the
first and the last record, and date3 can be set to missing. But ID 3 has 2
rows. Since I sorted the data by ID and date1, the row with ID=3 and
date1=2017-01-10 is the last record, so I need to set date3=NA for this row
only.

The question is: how can I identify the "last" record and set it to NA in the
date3 column?
Thank you,
Kai


Re: [R] ggplot error of "`data` must be a data frame, or other object coercible by `fortify()`, not an S3 object with class rxlsx"

2021-08-26 Thread Kai Yang via R-help
 Thank you all for your help. The error message is gone.
On Thursday, August 26, 2021, 04:07:59 PM PDT, Avi Gross via R-help 
 wrote:  
 
 This illustrates many things but in particular, why there is a difference 
between saying you tried:

 

    class(eth)

 

And saying the function you (think you) called is documented to return a 
data.frame.

 

Just typing something asking for the class would rapidly have shown it was not 
a data.frame and also what it was. True, having multiple packages in some order 
overlay each other is a bit subtle for some and I am glad quite a few people 
here noticed it. 

 

It may indeed make sense to more fully specify package::function notation in 
anything you let others use as they may indeed load more packages …

 

From: John C Frain  
Sent: Thursday, August 26, 2021 3:17 PM
To: Kai Yang 
Cc: r-help@r-project.org; Avi Gross 
Subject: Re: [R] ggplot error of "`data` must be a data frame, or other object 
coercible by `fortify()`, not an S3 object with class rxlsx"

 

officer redefines the read_xlsx command.  You should have got a message to that 
effect when you loaded the officer package.  You can use the version from the 
readxl package with

 

readxl::read_xlsx()  command.
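
A minimal sketch of the namespace-qualified call, assuming the readxl package is installed. Because c:/temp/eth.xlsx is not available here, the example reads a demo workbook that ships with readxl (located with readxl_example()); that substitution is an assumption made only so the code is self-contained.

```r
library(readxl)

# Assumption: a demo workbook bundled with readxl stands in for c:/temp/eth.xlsx
path <- readxl_example("datasets.xlsx")

# The package:: prefix picks readxl's reader no matter which packages
# are attached later and mask the bare name read_xlsx
eth <- readxl::read_xlsx(path)

print(class(eth))          # a tibble, which is also a data.frame
print(is.data.frame(eth))
```

Checking class(eth) right after reading is a quick way to confirm which read_xlsx() was actually called.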




John C Frain

3 Aranleigh Park

Rathfarnham
Dublin 14
Ireland
www.tcd.ie/Economics/staff/frainj/home.html 
<http://www.tcd.ie/Economics/staff/frainj/home.html> 

https://jcfrain.wordpress.com/

https://jcfraincv19.wordpress.com/


mailto:fra...@tcd.ie <mailto:fra...@tcd.ie> 
mailto:fra...@gmail.com <mailto:fra...@gmail.com> 

 

 

On Thu, 26 Aug 2021 at 20:04, Kai Yang via R-help <r-help@r-project.org> wrote:

 Hi all,
I found something, but I don't know why it happens.
When I submit the following code, eth is a data frame. I can see 14 obs. of
2 variables:
library(readxl)
library(ggplot2)
eth <- read_xlsx("c:/temp/eth.xlsx")


But when I load more packages (see below), eth is a "List of 1":
library(readxl)
library(ggplot2)
library(dplyr)
library(magrittr)
library(knitr)
library(xtable)
library(flextable)
library(officer)
eth <- read_xlsx("c:/temp/eth.xlsx")

But I will need those packages in the future. Is there a way to fix the problem?
Thanks,
Kai

On Thursday, August 26, 2021, 11:37:53 AM PDT, Kai Yang via R-help <r-help@r-project.org> wrote:

  Hi All,
1. the eth is a data frame (not sure that based on error message?) that I load 
it from excel file. Here is the code: eth <- read_xlsx("c:/temp/eth.xlsx")
2. I try to use the code to convert eth into eth2, but I got error message:
> eth2 <- data.frame(eth)
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = 
stringsAsFactors) : 
  cannot coerce class ‘"rxlsx"’ to a data.frame

So, it seems the data.frame can not do this data convert? Do you know which 
statement/function can do this?


thank you for your help.

On Thursday, August 26, 2021, 09:33:51 AM PDT, Avi Gross via R-help <r-help@r-project.org> wrote:

 Kai,

The answer is fairly probable to find  if you examine your variable "eth" as 
that is the only time you are being asked to provide the argument as in 
"ggplot(data=eth, ..) ...)

As the message states, it expects that argument to be a data frame or something 
it can change into a data.frame. What you gave it probably is an object meant 
to represent an EXCEL file or something. You may need to extract a data.frame 
(or tibble or ...) from it before passing that to ggplot.

Avi

-Original Message-
From: R-help <r-help-boun...@r-project.org> On Behalf Of Kai Yang via R-help
Sent: Thursday, August 26, 2021 11:53 AM
To: R-help Mailing List <r-help@r-project.org>
Subject: [R] ggplot error of "`data` must be a data frame, or other object 
coercible by `fortify()`, not an S3 object with class rxlsx"

Hello List,
I got an error message when I submit the code below ggplot(eth, aes(ymax=ymax, 
ymin=ymin, xmax=4, xmin=3, fill=ethnicity)) +  geom_rect() +  
coord_polar(theta="y")  +  xlim(c(2, 4)  ) 

Error: `data` must be a data frame, or other object coercible by `fortify()`, 
not an S3 object with class rxlsx


I checked the syntax. But I can  not find any error on my code. Can you help me 
to find where is the problem?

Thanks


Re: [R] ggplot error of "`data` must be a data frame, or other object coercible by `fortify()`, not an S3 object with class rxlsx"

2021-08-26 Thread Kai Yang via R-help
 Hi all,
I found something, but I don't know why it happens.
When I submit the following code, eth is a data frame. I can see 14 obs. of
2 variables:
library(readxl)
library(ggplot2)
eth <- read_xlsx("c:/temp/eth.xlsx")


But when I load more packages (see below), eth is a "List of 1":
library(readxl)
library(ggplot2)
library(dplyr)
library(magrittr)
library(knitr)
library(xtable)
library(flextable)
library(officer)
eth <- read_xlsx("c:/temp/eth.xlsx")

But I will need those packages in the future. Is there a way to fix the problem?
Thanks,
Kai

On Thursday, August 26, 2021, 11:37:53 AM PDT, Kai Yang via R-help 
 wrote:  
 
  Hi All,
1. the eth is a data frame (not sure that based on error message?) that I load 
it from excel file. Here is the code: eth <- read_xlsx("c:/temp/eth.xlsx")
2. I try to use the code to convert eth into eth2, but I got error message:
> eth2 <- data.frame(eth)
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = 
stringsAsFactors) : 
  cannot coerce class ‘"rxlsx"’ to a data.frame

So, it seems the data.frame can not do this data convert? Do you know which 
statement/function can do this?


thank you for your help.

    On Thursday, August 26, 2021, 09:33:51 AM PDT, Avi Gross via R-help 
 wrote:  
 
 Kai,

The answer is fairly probable to find  if you examine your variable "eth" as 
that is the only time you are being asked to provide the argument as in 
"ggplot(data=eth, ..) ...)

As the message states, it expects that argument to be a data frame or something 
it can change into a data.frame. What you gave it probably is an object meant 
to represent an EXCEL file or something. You may need to extract a data.frame 
(or tibble or ...) from it before passing that to ggplot.

Avi

-Original Message-
From: R-help  On Behalf Of Kai Yang via R-help
Sent: Thursday, August 26, 2021 11:53 AM
To: R-help Mailing List 
Subject: [R] ggplot error of "`data` must be a data frame, or other object 
coercible by `fortify()`, not an S3 object with class rxlsx"

Hello List,
I got an error message when I submit the code below ggplot(eth, aes(ymax=ymax, 
ymin=ymin, xmax=4, xmin=3, fill=ethnicity)) +  geom_rect() +  
coord_polar(theta="y")  +  xlim(c(2, 4)  ) 

Error: `data` must be a data frame, or other object coercible by `fortify()`, 
not an S3 object with class rxlsx


I checked the syntax. But I can  not find any error on my code. Can you help me 
to find where is the problem?

Thanks



Re: [R] ggplot error of "`data` must be a data frame, or other object coercible by `fortify()`, not an S3 object with class rxlsx"

2021-08-26 Thread Kai Yang via R-help
 Hi All,
1. eth is a data frame (although, based on the error message, I'm not sure?)
that I load from an Excel file. Here is the code:
eth <- read_xlsx("c:/temp/eth.xlsx")
2. I tried the code below to convert eth into eth2, but I got an error message:
> eth2 <- data.frame(eth)
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = 
stringsAsFactors) : 
  cannot coerce class ‘"rxlsx"’ to a data.frame

So it seems data.frame() cannot do this conversion? Do you know which
statement/function can do this?


thank you for your help.

On Thursday, August 26, 2021, 09:33:51 AM PDT, Avi Gross via R-help 
 wrote:  
 
 Kai,

The answer is fairly probable to find  if you examine your variable "eth" as 
that is the only time you are being asked to provide the argument as in 
"ggplot(data=eth, ..) ...)

As the message states, it expects that argument to be a data frame or something 
it can change into a data.frame. What you gave it probably is an object meant 
to represent an EXCEL file or something. You may need to extract a data.frame 
(or tibble or ...) from it before passing that to ggplot.

Avi

-Original Message-----
From: R-help  On Behalf Of Kai Yang via R-help
Sent: Thursday, August 26, 2021 11:53 AM
To: R-help Mailing List 
Subject: [R] ggplot error of "`data` must be a data frame, or other object 
coercible by `fortify()`, not an S3 object with class rxlsx"

Hello List,
I got an error message when I submit the code below ggplot(eth, aes(ymax=ymax, 
ymin=ymin, xmax=4, xmin=3, fill=ethnicity)) +  geom_rect() +  
coord_polar(theta="y")  +  xlim(c(2, 4)  ) 

Error: `data` must be a data frame, or other object coercible by `fortify()`, 
not an S3 object with class rxlsx


I checked the syntax. But I can  not find any error on my code. Can you help me 
to find where is the problem?

Thanks



[R] ggplot error of "`data` must be a data frame, or other object coercible by `fortify()`, not an S3 object with class rxlsx"

2021-08-26 Thread Kai Yang via R-help
Hello List,
I got an error message when I submit the code below
ggplot(eth, aes(ymax=ymax, ymin=ymin, xmax=4, xmin=3, fill=ethnicity)) +  
geom_rect() +  coord_polar(theta="y")  +  xlim(c(2, 4)   ) 

Error: `data` must be a data frame, or other object coercible by `fortify()`, 
not an S3 object with class rxlsx


I checked the syntax, but I cannot find any error in my code. Can you help me
find where the problem is?

Thanks



[R] data manipulation question

2021-08-23 Thread Kai Yang via R-help
Hello List,
I wrote the script below to assign value to a new field DisclosureStatus.
my goal is if gl_resultsdisclosed=1 then DisclosureStatus=DISCLOSED
else if gl_resultsdisclosed=0 then DisclosureStatus= ATTEMPTED
else if gl_resultsdisclosed is missing and gl_discloseattempt1 is not missing 
then DisclosureStatus= ATTEMPTED
else missing


germlinepatients$DisclosureStatus <- 
              ifelse(germlinepatients$gl_resultsdisclosed==1, "DISCLOSED",
                ifelse(germlinepatients$ gl_resultsdisclosed==0, "ATTEMPTED", 
                   ifelse(is.na(germlinepatients$gl_resultsdisclosed) & 
germlinepatients$gl_discloseattempt1!='', "ATTEMPTED",
                                                           NA)))

The first two conditions give me the right result, but the third does not.
After checking the data, there are 23 cases where gl_resultsdisclosed is
missing and gl_discloseattempt1 is not missing. The code does not give any
error message.
Please help.
Thank you
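
A likely cause (an assumption, since the actual data are not shown): a comparison such as NA == 1 evaluates to NA, and ifelse() returns NA wherever its test is NA, so rows with a missing gl_resultsdisclosed never reach the third, nested test. One possible rearrangement is to test is.na() first. The sketch below uses a small hypothetical stand-in (gp) for germlinepatients, with made-up values.

```r
# Hypothetical stand-in for germlinepatients
gp <- data.frame(
  gl_resultsdisclosed = c(1, 0, NA, NA),
  gl_discloseattempt1 = c("2021-01-05", "2021-02-01", "2021-03-09", "")
)

disclosed <- gp$gl_resultsdisclosed
attempted <- !is.na(gp$gl_discloseattempt1) & gp$gl_discloseattempt1 != ""

# NA == 1 is NA, and ifelse() returns NA for an NA test, so the missing
# case has to be handled before the comparisons
gp$DisclosureStatus <- ifelse(
  is.na(disclosed),
  ifelse(attempted, "ATTEMPTED", NA),
  ifelse(disclosed == 1, "DISCLOSED",
         ifelse(disclosed == 0, "ATTEMPTED", NA))
)
print(gp$DisclosureStatus)
```

The third row (missing result, attempt recorded) now gets "ATTEMPTED", while the fourth row (both missing or empty) stays NA.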



Re: [R] ggplot: add percentage for each element in legend and remove tick mark

2021-08-13 Thread Kai Yang via R-help
 Got it. Thank you.
On Friday, August 13, 2021, 03:03:26 PM PDT, Bert Gunter 
 wrote:  
 
 It's dput()  *not* dupt() .  ?dput tells you how to use it (as usual).
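
For readers who want a concrete example of dput(), here is a short base R sketch. The three-row eth data frame is a hand-typed excerpt of the table posted in this thread. Wrapping head() inside dput() limits how many rows are written out.

```r
# Small stand-in for the ethnicity table posted in this thread
eth <- data.frame(
  ethnicity   = c("Caucasian", "Ashkenazi Jewish", "Multiple"),
  individuals = c(36062, 4309, 3193)
)

# dput() prints R code that rebuilds the object; wrap head() inside it
# to limit how many rows are written
dput(head(eth, 2))

# Round trip: the printed text can be parsed back into an equal object
txt  <- paste(capture.output(dput(head(eth, 2))), collapse = "\n")
copy <- eval(parse(text = txt))
print(all.equal(copy, head(eth, 2)))
```

The text printed by dput() can be pasted straight into a message, and anyone on the list can recreate the data by assigning it to a name.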

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Fri, Aug 13, 2021 at 2:48 PM Kai Yang via R-help
 wrote:
>
>  Hello John,
> I put my testing data below. I'm not sure how to use dupt() function. would 
> you please give me an example?
> Thanks,
> Kai
>
> | ethnicity | individuals |
> | Caucasian | 36062 |
> | Ashkenazi Jewish | 4309 |
> | Multiple | 3193 |
> | Hispanic | 2113 |
> | Asian. not specified | 1538 |
> | Chinese | 1031 |
> | African | 643 |
> | Unknown | 510 |
> | Filipino | 222 |
> | Japanese | 129 |
> | Native American | 116 |
> | Indian | 111 |
> | Pacific Islander | 23 |
>
>
>
>    On Friday, August 13, 2021, 06:21:29 AM PDT, John Kane 
> wrote:
>
>  Would you supply some sample data please? A handy way to supply sample
> data is to use the dput() function. See ?dput.  If you have a very
> large data set then something like dput(head(myfile, 100)) will likely
> supply enough data for us to work with.
>
> On Thu, 12 Aug 2021 at 11:45, Kai Yang via R-help  
> wrote:
> >
> > Hello List,
> > I use the following code to generate a donut plot.
> > # Compute percentages
> > eth$fraction = eth$individuals / sum(eth$individuals)
> > # Compute the cumulative percentages (top of each rectangle)
> > eth$ymax = cumsum(eth$fraction)
> > # Compute the bottom of each rectangle
> > eth$ymin = c(0, head(eth$ymax, n=-1))
> > # Make the plot using percentage
> > ggplot(eth, aes(ymax=ymax, ymin=ymin, xmax=4, xmin=3, fill=ethnicity)) +
> >  geom_rect() +
> >  coord_polar(theta="y")  +
> >  xlim(c(2, 4)
> >  )
> >
> > I want to improve the plot for two thing:
> > 1. the legend: I need to add percentage (eth$fraction * 100 and then add %) 
> > for each of element.
> > 2. remove all number (tick mark ?) around the plot
> > Please help
> > Thank you,
> > Kai
> >
> >        [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> John Kane
> Kingston ON Canada
>
  


Re: [R] ggplot: add percentage for each element in legend and remove tick mark

2021-08-13 Thread Kai Yang via R-help
 Hello John,
I put my testing data below. I'm not sure how to use dupt() function. would you 
please give me an example?
Thanks,
Kai

| ethnicity | individuals |
| Caucasian | 36062 |
| Ashkenazi Jewish | 4309 |
| Multiple | 3193 |
| Hispanic | 2113 |
| Asian. not specified | 1538 |
| Chinese | 1031 |
| African | 643 |
| Unknown | 510 |
| Filipino | 222 |
| Japanese | 129 |
| Native American | 116 |
| Indian | 111 |
| Pacific Islander | 23 |



On Friday, August 13, 2021, 06:21:29 AM PDT, John Kane 
 wrote:  
 
 Would you supply some sample data please? A handy way to supply sample
data is to use the dput() function. See ?dput.  If you have a very
large data set then something like dput(head(myfile, 100)) will likely
supply enough data for us to work with.

On Thu, 12 Aug 2021 at 11:45, Kai Yang via R-help  wrote:
>
> Hello List,
> I use the following code to generate a donut plot.
> # Compute percentages
> eth$fraction = eth$individuals / sum(eth$individuals)
> # Compute the cumulative percentages (top of each rectangle)
> eth$ymax = cumsum(eth$fraction)
> # Compute the bottom of each rectangle
> eth$ymin = c(0, head(eth$ymax, n=-1))
> # Make the plot using percentage
> ggplot(eth, aes(ymax=ymax, ymin=ymin, xmax=4, xmin=3, fill=ethnicity)) +
>  geom_rect() +
>  coord_polar(theta="y")  +
>  xlim(c(2, 4)
>  )
>
> I want to improve the plot for two thing:
> 1. the legend: I need to add percentage (eth$fraction * 100 and then add %) 
> for each of element.
> 2. remove all number (tick mark ?) around the plot
> Please help
> Thank you,
> Kai
>



-- 
John Kane
Kingston ON Canada
  


[R] ggplot: add percentage for each element in legend and remove tick mark

2021-08-12 Thread Kai Yang via R-help
Hello List,
I use the following code to generate a donut plot.
# Compute percentages
eth$fraction = eth$individuals / sum(eth$individuals)
# Compute the cumulative percentages (top of each rectangle)
eth$ymax = cumsum(eth$fraction)
# Compute the bottom of each rectangle
eth$ymin = c(0, head(eth$ymax, n=-1))
# Make the plot using percentage
ggplot(eth, aes(ymax=ymax, ymin=ymin, xmax=4, xmin=3, fill=ethnicity)) +
  geom_rect() +
  coord_polar(theta="y")  +
  xlim(c(2, 4) 
  ) 

I want to improve the plot for two thing: 
1. the legend: I need to add percentage (eth$fraction * 100 and then add %) for 
each of element.
2. remove all number (tick mark ?) around the plot
Please help
Thank you,
Kai
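
One possible way to get both changes, sketched under assumptions: the data frame below holds only the first four rows of the posted table, and the column name label is introduced here for illustration. The idea is to build the legend text before plotting and to let theme_void() drop the axis numbers and tick marks.

```r
library(ggplot2)

# Stand-in for the poster's data (first four rows of the posted table).
eth <- data.frame(
  ethnicity   = c("Caucasian", "Ashkenazi Jewish", "Multiple", "Hispanic"),
  individuals = c(36062, 4309, 3193, 2113)
)

# Same computations as in the original post.
eth$fraction <- eth$individuals / sum(eth$individuals)
eth$ymax     <- cumsum(eth$fraction)
eth$ymin     <- c(0, head(eth$ymax, n = -1))

# Legend text: name plus percentage. The factor keeps the row order
# instead of letting the legend sort alphabetically.
eth$label <- sprintf("%s (%.1f%%)", eth$ethnicity, 100 * eth$fraction)
eth$label <- factor(eth$label, levels = eth$label)

p <- ggplot(eth, aes(ymax = ymax, ymin = ymin, xmax = 4, xmin = 3, fill = label)) +
  geom_rect() +
  coord_polar(theta = "y") +
  xlim(c(2, 4)) +
  theme_void() +              # removes axis text, ticks and grid lines
  labs(fill = "ethnicity")    # legend title

print(p)
```

theme_void() removes everything except the geoms and the legend. If only the numbers around the ring should go, theme(axis.text = element_blank(), axis.ticks = element_blank()) is a narrower alternative.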



Re: [R] ls() pattern question

2021-07-14 Thread Kai Yang via R-help
 Thanks Andrew. it works well. --- Kai
On Wednesday, July 14, 2021, 05:22:01 PM PDT, Bert Gunter 
 wrote:  
 
 Actually fun( param != something..) is syntactically incorrect in the first 
place for any function! 

ls sees "pat != whatever"  as the "name" argument of ls() and can't make any 
sense of it, of course. 
 
Bert Gunter

"The trouble with having an open mind is that people keep coming along and 
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Jul 14, 2021 at 5:01 PM Andrew Simmons  wrote:

Hello,


First, `ls` does not support `!=` for pattern, but it's actually throwing a
different error. For `rm`, the objects provided into `...` are substituted
(not evaluated), so you should really do something like

rm(list = ls(pattern = ...))

As for all except "con", "DB2", and "ora", I would try something like

setdiff(ls(), c("con", "DB2", "ora"))

and then add `rm` to that like

rm(list = setdiff(ls(), c("con", "DB2", "ora")))
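
A small sketch of both variants, run inside a scratch environment (the name e and the extra object names are hypothetical) so that nothing in a real workspace is removed while trying it out. The pattern version uses grep() with invert = TRUE, which is one way to express "everything that does not match".

```r
# Scratch environment with a few throwaway objects.
e <- new.env()
for (nm in c("con", "DB2", "ora", "tmp1", "tmp2", "ora_copy")) {
  assign(nm, data.frame(x = 1), envir = e)
}

keep      <- c("con", "DB2", "ora")
all_names <- ls(e)

# Variant 1: exact names, as suggested above.
drop_by_name <- setdiff(all_names, keep)

# Variant 2: regular expression. The ^ and $ anchors matter here;
# without them "ora_copy" would also match and be kept.
drop_by_pattern <- grep("^(con|DB2|ora)$", all_names, value = TRUE, invert = TRUE)

# rm() needs the names as a character vector via list=.
rm(list = drop_by_name, envir = e)
ls(e)
```

In an interactive session the envir argument can be dropped, which gives the one-liner shown above.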

On Wed, Jul 14, 2021 at 7:41 PM Kai Yang via R-help 
wrote:

> Hello List,
> I have many data frames in environment.  I need to keep 3 data frames
> only, con DB2 and ora.
> I write the script to do this.
> rm(ls(pattern != c("(con|DB2|ora)")))
>
>
> but it give me an error message:
>
>
> Error in rm(ls(pattern != c("(con|DB2|ora)"))) :
>   ... must contain names or character strings
>
> I think the pattern option doesn't support != ? and is it possible to fix
> this?
> Thank you,
> Kai
>
>
>


[R] ls() pattern question

2021-07-14 Thread Kai Yang via R-help
Hello List,
I have many data frames in environment.  I need to keep 3 data frames only, con 
DB2 and ora. 
I write the script to do this. 
rm(ls(pattern != c("(con|DB2|ora)")))


but it give me an error message:


Error in rm(ls(pattern != c("(con|DB2|ora)"))) : 
  ... must contain names or character strings

I think the pattern option doesn't support != ? and is it possible to fix this?
Thank you,
Kai





Re: [R] assign a data frame name from a list in do loop

2021-07-14 Thread Kai Yang via R-help
 Hello Rui,
it's very helpful. 
Thank you,
Kai
On Wednesday, July 14, 2021, 10:07:57 AM PDT, Rui Barradas 
 wrote:  
 
 Hello,

Just before for(j in 1:nrow(ora)) include the following code line (I 
have removed the underscore):

sdif <- vector("list", length = nrow(ora))


In the loop:

sdif[[j]] <- sqldf(etc)



Also, once again, why noquote? It's better to form file names with 
file.path:


rdcsv  <- file.path("w:/project/_Joe.B/Oracle/data", mycsv)
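
A self-contained sketch of the pattern, assuming a stand-in ora with two made-up file names and a placeholder data frame in place of the sqldf() result. Naming the list elements after ora$fname lets each result be looked up by name later.

```r
# Stand-in for the data frame of file names (hypothetical values).
ora <- data.frame(fname = c("RESPONDENTS", "RESPONSES"), stringsAsFactors = FALSE)

# Preallocate one slot per row and name the slots after the files.
sdif <- vector("list", length = nrow(ora))
names(sdif) <- ora$fname

for (j in seq_len(nrow(ora))) {
  # Placeholder for the real comparison, e.g. the sqldf() call in the thread.
  sdif[[j]] <- data.frame(table = ora$fname[j], n_diff = j)
}

# Each result is an ordinary data frame, reachable by position or by name.
sdif[[1]]
sdif[["RESPONSES"]]
```

seq_len(nrow(ora)) is used instead of 1:nrow(ora) so the loop is skipped cleanly when ora has no rows.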


Hope this helps,

Rui Barradas


Às 16:55 de 14/07/2021, Kai Yang via R-help escreveu:
> Hello List,
> I wrote a script below to compare the difference of data frames structure 
> (and will do something else). First of all I save the file list in a data 
> frame ora, then I use for loop to 1. load the data from two resource, 2. 
> generate data structure into two data frames, 3.do the comparesion of the two 
> data frames of data structure and put it into a data frame, 4. remove the 
> useless data frames for next loop
> for (j in 1:nrow(ora))
> {
>    mycol  <- ora[j,"fname"]
> #work on csv
>    mycsv  <- paste0(mycol,".csv")
>    rdcsv  <- noquote(paste0("w:/project/_Joe.B/Oracle/data/", mycsv))
>    rr     <- read.csv(rdcsv)
>
> #work on SS 
>    myss   <- paste0("gemd.", mycol)
>    rdss   <- paste0('select * from ',myss)
>    ss     <- dbGetQuery(con, rdss)
>    
> #compare DF structure   
> str_rr <- as.data.frame(summary.default(rr))
> str_ss <- as.data.frame(summary.default(ss))
>
>
> sdif_[j] <- sqldf('select * from str_rr except select * from str_ss')
>
> #-remove data frame from memory--
>
>    rm(rr)
>    rm(ss)
>    rm(str_rr)
>    rm(str_ss)
> }
> In the step 3, I want to use ora$fname in the loop to assign a data frame 
> name. So, I will look the output later. But sdif_[j] doesn't work for this. 
> Can someone help me to fix this part?
> Thanks,
> Kai




[R] assign a data frame name from a list in do loop

2021-07-14 Thread Kai Yang via R-help
Hello List,
I wrote a script below to compare the difference of data frames structure (and 
will do something else). First of all I save the file list in a data frame ora, 
then I use for loop to 1. load the data from two resource, 2. generate data 
structure into two data frames, 3.do the comparesion of the two data frames of 
data structure and put it into a data frame, 4. remove the useless data frames 
for next loop
for (j in 1:nrow(ora))
{
  mycol  <- ora[j,"fname"]
#work on csv  
  mycsv  <- paste0(mycol,".csv")
  rdcsv  <- noquote(paste0("w:/project/_Joe.B/Oracle/data/", mycsv))
  rr     <- read.csv(rdcsv)

#work on SS   
  myss   <- paste0("gemd.", mycol)
  rdss   <- paste0('select * from ',myss)
  ss     <- dbGetQuery(con, rdss)
  
#compare DF structure     
str_rr <- as.data.frame(summary.default(rr))
str_ss <- as.data.frame(summary.default(ss))


sdif_[j] <- sqldf('select * from str_rr except select * from str_ss')

#-remove data frame from memory--

  rm(rr)
  rm(ss)
  rm(str_rr)
  rm(str_ss)
}
In the step 3, I want to use ora$fname in the loop to assign a data frame name. 
So, I will look the output later. But sdif_[j] doesn't work for this. Can 
someone help me to fix this part?
Thanks,
Kai


Re: [R] problem for strsplit function

2021-07-09 Thread Kai Yang via R-help
 Thanks Bert,
I'm reading some books now. But it takes me a while to get familiar R.

Best,
Kai
On Friday, July 9, 2021, 03:06:11 PM PDT, Duncan Murdoch wrote:
 
 On 09/07/2021 5:51 p.m., Jeff Newmiller wrote:
> "Strictly speaking", Greg is correct, Bert.
> 
> https://cran.r-project.org/doc/manuals/r-release/R-lang.html#List-objects
> 
> Lists in R are vectors. What we colloquially refer to as "vectors" are more 
> precisely referred to as "atomic vectors". And without a doubt, this "vector" 
> nature of lists is a key underlying concept that explains why adding a dim 
> attribute creates a matrix that can hold data frames. It is also a stumbling 
> block for programmers from other languages that have things like linked lists.

I would also object to v3 (below) as "extracting" a column from d. 
"d[2]" doesn't extract anything, it "subsets" the data frame, so the 
result is a data frame, not what you get when you extract something from 
a data frame.

People don't realize that "x <- 1:10; y <- x[[3]]" is perfectly legal. 
That extracts the 3rd element (the number 3).  The problem is that R has 
no way to represent a scalar number, only a vector of numbers, so x[[3]] 
gets promoted to a vector containing that number when it is returned and 
assigned to y.

Lists are vectors of R objects, so if x is a list, x[[3]] is something 
that can be returned, and it is different from x[3].
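
The distinction can be tried out with a short sketch; the object names below are made up for illustration.

```r
# Atomic vector: [[ ]] extracts one element, returned as a length-one vector.
x <- 1:10
y <- x[[3]]

# Data frame: [ ] subsets (result is still a data frame),
# [[ ]] extracts the column itself.
d <- data.frame(col1 = 1:3, col2 = letters[1:3], stringsAsFactors = FALSE)
sub_d <- d[2]
col_d <- d[[2]]
class(sub_d)
class(col_d)

# List: [ ] gives a shorter list, [[ ]] gives the element it holds.
lst <- list(a = 1:3, b = "text")
str(lst[1])
str(lst[[1]])
```

str() is handy here because it shows at a glance whether the result is still a list or data frame, or the object inside it.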

Duncan Murdoch

> 
> On July 9, 2021 2:36:19 PM PDT, Bert Gunter  wrote:
>> "1.  a column, when extracted from a data frame, *is* a vector."
>> Strictly speaking, this is false; it depends on exactly what is meant
>> by "extracted." e.g.:
>>
>> > d <- data.frame(col1 = 1:3, col2 = letters[1:3])
>> > v1 <- d[,2] ## a vector
>> > v2 <- d[[2]] ## the same, i.e
>> > identical(v1,v2)
>> [1] TRUE
>> > v3 <- d[2] ## a data.frame
>> > v1
>> [1] "a" "b" "c"  ## a character vector
>> > v3
>>  col2
>> 1    a
>> 2    b
>> 3    c
>> > is.vector(v1)
>> [1] TRUE
>> > is.vector(v3)
>> [1] FALSE
>> > class(v3)  ## data.frame
>> [1] "data.frame"
>> ## but
>> > is.list(v3)
>> [1] TRUE
>>
>> which is simply explained in ?data.frame (where else?!) by:
>> "A data frame is a **list** [emphasis added] of variables of the same
>> number of rows with unique row names, given class "data.frame". If no
>> variables are included, the row names determine the number of rows."
>>
>> "2.  maybe your question is "is a given function for a vector, or for a
>>    data frame/matrix/array?".  if so, i think the only way is reading
>>    the help information (?foo)."
>>
>> Indeed! Is this not what the Help system is for?! But note also that
>> the S3 class system may somewhat blur the issue: foo() may work
>> appropriately and differently for different (S3) classes of objects. A
>> detailed explanation of this behavior can be found in appropriate
>> resources or (more tersely) via ?UseMethod .
>>
>> "you might find reading ?"[" and  ?"[.data.frame" useful"
>>
>> Not just 'useful" -- **essential** if you want to work in R, unless
>> one gets this information via any of the numerous online tutorials,
>> courses, or books that are available. The Help system is accurate and
>> authoritative, but terse. I happen to like this mode of documentation,
>> but others may prefer more extended expositions. I stand by this claim
>> even if one chooses to use the "Tidyverse", data.table package, or
>> other alternative frameworks for handling data. Again, others may
>> disagree, but R is structured around these basics, and imo one remains
>> ignorant of them at their peril.
>>
>> Cheers,
>> Bert
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>> On Fri, Jul 9, 2021 at 11:57 AM Greg Minshall 
>> wrote:
>>>
>>> Kai,
>>>
>>>> one more question, how can I know if the function is for column
>>>> manipulations or for vector?
>>>
>>> i still stumble around R code.  but, i'd say the following (and look
>>> forward to being corrected! :):
>>>
>>> 1.  a column, when extracted from a data frame, *is* a vector.
>>>
>>> 2.  maybe your question is "is a given function for a vector, or for a
>>>      data frame/matrix/array?".  if so, i think the only way is reading
>>>      the help information (?foo).
>>>
>>> 3.  sometimes, extracting the column as a vector from a data frame-like
>>>      object might be non-intuitive.  you might find reading ?"[" and
>>>      ?"[.data.frame" useful (as well as ?"[.data.table" if you use that
>>>      package).  also, the str() command can be helpful in understanding
>>>      what is happening.  (the lobstr:: package's sxp() function, as well
>>>      as more verbose .Internal(inspect()) can also give you insight.)
>>>
>>>      with the data.table:: package, for example, if "DT" is a data.table
>>>      object, with "x2" as a column, adding or leaving off quotation marks
>>>      for the 

Re: [R] error message from read.csv in loop

2021-07-09 Thread Kai Yang via R-help
 Hi Migdonio,
I did try your code:
# Initialize the rr variable as a list.
rr <- as.list(rep(NA, nrow(ora)))

# Run the for-loop to store all the CSVs in rr.
for (j in 1:nrow(ora))
{
        mycol  <- ora[j,"fname"]
        mycsv  <- paste0(mycol,".csv")
        rdcsv  <- noquote(paste0("w:/project/_Joe.B/Oracle/data/", mycsv))
        rr[[j]]     <- read.csv(rdcsv)
}

this code is working, but rr is not a data frame, R said: Large list ( 20 
elements .). how can I use it as a data frame one by one?
Thank you for your help
Kai
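
A sketch of how the elements of such a list can be used one at a time. To keep it runnable anywhere, it writes two small made-up CSV files to a temporary directory instead of reading from the W: drive; the file names and columns are assumptions.

```r
# Create two small CSV files in a temporary directory (stand-ins for the real files).
csv_dir <- tempdir()
ora <- data.frame(fname = c("RESPONDENTS", "RESPONSES"), stringsAsFactors = FALSE)
for (f in ora$fname) {
  write.csv(data.frame(id = 1:3, source = f),
            file.path(csv_dir, paste0(f, ".csv")), row.names = FALSE)
}

# Read every file into one slot of a named list.
rr <- vector("list", nrow(ora))
names(rr) <- ora$fname
for (j in seq_len(nrow(ora))) {
  rr[[j]] <- read.csv(file.path(csv_dir, paste0(ora$fname[j], ".csv")))
}

# Each element is a data frame; pick one by position or by name.
first_df <- rr[[1]]
resp_df  <- rr[["RESPONSES"]]
str(resp_df)

# Apply the same function to every data frame in the list.
row_counts <- vapply(rr, nrow, integer(1))
```

The list is only a container: rr[[j]] can be passed to any function that expects a data frame.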
On Friday, July 9, 2021, 11:39:59 AM PDT, Migdonio González 
 wrote:  
 
 It seems that your problem is that you are using single quotes inside of the 
double quotes. This is not necessary. Here is the corrected for-loop:
for (j in 1:nrow(ora))
{
        mycol  <- ora[j,"fname"]
        mycsv  <- paste0(mycol,".csv")
        rdcsv  <- noquote(paste0("w:/project/_Joe.B/Oracle/data/", mycsv))
        rr     <- read.csv(rdcsv)
}
Also note that the rr variable will only store the last CSV, not all CSV. You 
will need to initialize the rr variable as a list to store all CSVs if that is 
what you require. Something like this:
# Initialize the rr variable as a list.
rr <- as.list(rep(NA, nrow(ora)))
# Run the for-loop to store all the CSVs in rr.
for (j in 1:nrow(ora))
{
        mycol  <- ora[j,"fname"]
        mycsv  <- paste0(mycol,".csv")
        rdcsv  <- noquote(paste0("w:/project/_Joe.B/Oracle/data/", mycsv))
        rr[[j]]     <- read.csv(rdcsv)
} 

Regards,
Migdonio G.

On Fri, Jul 9, 2021 at 1:10 PM Kai Yang via R-help  wrote:

Hello List,
I use for loop to read csv difference file into data frame rr.  The data frame 
rr will be deleted after a comparison and go to the next csv file.  Below is my 
code:
for (j in 1:nrow(ora))
{
  mycol  <- ora[j,"fname"]
  mycsv  <- paste0(mycol,".csv'")
  rdcsv  <- noquote(paste0("'w:/project/_Joe.B/Oracle/data/", mycsv))
  rr     <- read.csv(rdcsv)
}
but when I run this code, I got error message below:
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file ''w:/project/_Joe.B/Oracle/data/ASSAY_DEFINITIONS.csv'': No 
such file or directory

so, I checked the rdcsv and print it out, see below:
[1] 'w:/project/_Joe.B/Oracle/data/ASSAY_DEFINITIONS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/ASSAY_DISCRETE_VALUES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/ASSAY_QUESTIONS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/ASSAY_RUNS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/DATA_ENTRY_PAGES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/DISCRETE_VALUES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/ENTRY_GROUPS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/GEMD_CODELIST_GROUPS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/GEMD_CODELIST_VALUES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/GEMD_LOT_DEFINITIONS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/GEMD_SAMPLES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/MOLECULAR_WAREHOUSE.csv'
[1] 'w:/project/_Joe.B/Oracle/data/QUESTION_DEFINITIONS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/QUESTION_GROUPS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/RESPONDENTS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/RESPONSES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/SAMPLE_LIST.csv'
[1] 'w:/project/_Joe.B/Oracle/data/SAMPLE_LIST_NAMES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/SAMPLE_PLATE_ADDRESSES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/STORAGE_UNITS.csv'
it seems correct. I copy and paste it into a code :
 rr     <- read.csv( 'w:/project/_Joe.B/Oracle/data/RESPONDENTS.csv')
and it works fine.
Can someone help me debug where is the problem in my for loop code?
Thanks,
Kai







[R] error message from read.csv in loop

2021-07-09 Thread Kai Yang via R-help
Hello List,
I use for loop to read csv difference file into data frame rr.  The data frame 
rr will be deleted after a comparison and go to the next csv file.  Below is my 
code:
for (j in 1:nrow(ora))
{
  mycol  <- ora[j,"fname"]
  mycsv  <- paste0(mycol,".csv'")
  rdcsv  <- noquote(paste0("'w:/project/_Joe.B/Oracle/data/", mycsv))
  rr     <- read.csv(rdcsv)
}
but when I run this code, I got error message below:
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file ''w:/project/_Joe.B/Oracle/data/ASSAY_DEFINITIONS.csv'': No 
such file or directory

so, I checked the rdcsv and print it out, see below:
[1] 'w:/project/_Joe.B/Oracle/data/ASSAY_DEFINITIONS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/ASSAY_DISCRETE_VALUES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/ASSAY_QUESTIONS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/ASSAY_RUNS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/DATA_ENTRY_PAGES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/DISCRETE_VALUES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/ENTRY_GROUPS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/GEMD_CODELIST_GROUPS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/GEMD_CODELIST_VALUES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/GEMD_LOT_DEFINITIONS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/GEMD_SAMPLES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/MOLECULAR_WAREHOUSE.csv'
[1] 'w:/project/_Joe.B/Oracle/data/QUESTION_DEFINITIONS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/QUESTION_GROUPS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/RESPONDENTS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/RESPONSES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/SAMPLE_LIST.csv'
[1] 'w:/project/_Joe.B/Oracle/data/SAMPLE_LIST_NAMES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/SAMPLE_PLATE_ADDRESSES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/STORAGE_UNITS.csv'
it seems correct. I copy and paste it into a code :
 rr     <- read.csv( 'w:/project/_Joe.B/Oracle/data/RESPONDENTS.csv')
and it works fine.
Can someone help me debug where is the problem in my for loop code?
Thanks,
Kai
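
A short sketch of why the printed value looks right while read.csv() still fails: the single quotes pasted into the string are part of the path itself, and noquote() only changes how the string is printed. The variable names bad and good are introduced here for illustration.

```r
mycol <- "RESPONDENTS"

# Path built as in the post: the single quotes become part of the string.
bad <- paste0("'w:/project/_Joe.B/Oracle/data/", mycol, ".csv'")

# Path built without the extra quotes.
good <- file.path("w:/project/_Joe.B/Oracle/data", paste0(mycol, ".csv"))

# noquote() hides the surrounding double quotes when printing, so the
# embedded single quotes look like ordinary string delimiters.
print(noquote(bad))
print(bad)

# The two strings differ by exactly the two quote characters.
nchar(bad) - nchar(good)
substr(bad, 1, 1)
```

A literal path typed into read.csv() works because there the quotes are R syntax, not characters inside the file name.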







Re: [R] problem for strsplit function

2021-07-08 Thread Kai Yang via R-help
 Hello all,
I have to learn R from the beginning, since my group will get rid of SAS, so my 
question may not be very clear to professional R users. I always deal with 
columns in a data frame, not with data vectors. 
Many thanks for Greg's example; it is very helpful.
One more question: how can I know whether a function is for column manipulation 
or for vectors?
Thank you,
Kai
On Wednesday, July 7, 2021, 10:36:53 PM PDT, Greg Minshall 
 wrote:  
 
 > sub( "\\.[^.]*$", "", fname )

fwiw, i almost always use '[.]' in preference to '\\.', as it
seems to be more likely to get through the various levels of quoting in
different contexts.
  


[R] problem for strsplit function

2021-07-07 Thread Kai Yang via R-help
 Hello List,
I have a  one column data frame to store file name with extension. I want to 
create new column to keep file name only without extension.
I tried to use strsplit("name1.csv", "\\.")[[1]] to do that, but it just retain 
the first row only and it is a vector.  how can do this for all of rows and put 
it into a new column?
thank you,
Kai






Re: [R] R Function question, (repost to fix the messy work format)

2021-07-02 Thread Kai Yang via R-help
 Hi Eric,
Thank you spent time to help me for this.

Here is the thing: I was requested to manage a sql server for my group. the 
server has many schemas and the tables (>200). I use ODBC to connect the server 
and get the schema name + table name into a data frame.

For each of schema + table on server, I need to run a summary report. So I 
wrote a summary script like this:

res <- dbGetQuery(con, "SELECT * FROM BIODBX.MECCUNIQUE2")
view(dfSummary(res), file = 
"W:/project/_Joe.B/MSSQL/try/summarytools.BIODBX.MECCUNIQUE2.html")
rm(res)

the script works well. but I don't want to write 200+ times of the script to 
summary each table. So, I'm trying to write the function to do this. this is my 
goal.

First of all, I'm not sure if this is the right way to do the summary report, 
because I'm a new R user. So please correct me if my idea is doable.

Second, would you please tell me what is "more detail" information do you need?

Thank you,
Kai


On Friday, July 2, 2021, 12:31:17 PM PDT, Eric Berger 
 wrote:  
 
 Hard for me to tell without more details but it looks like the following has 
several bugs
for (i in dbtable$Tot_table){
  Tabname <- as.character(sqldf(sprintf("SELECT Tot_table FROM dbtable", i)))
  summ(Tabname)
}

Your sprintf() statement seems to use 'i' but actually does not. You probably 
want to rewrite/rearrange this code. More like
x <- sqldf("SELECT Tot_table FROM dbtable")
for ( Tabname in x )
  summ(Tabname)
no doubt this is wrong but put a browser() call after the x <- sqldf(...) line 
and inspect x and go from there
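
A rough sketch of one way the loop could be arranged, under stated assumptions: summ_job is a hypothetical helper, dbtable is a two-row stand-in for the real table list, and the database calls are left as comments because they need a live connection. It also shows the table name being substituted into the report file name with sprintf().

```r
# Hypothetical helper: build the SQL text and the report path for one table.
summ_job <- function(tabname, out_dir) {
  list(
    query = sprintf("SELECT * FROM %s", tabname),
    file  = file.path(out_dir, sprintf("summarytools.%s.html", tabname))
  )
}

# Stand-in for the data frame of table names described in the thread.
dbtable <- data.frame(
  Tot_table = c("BIODBX.MECCUNIQUE2", "GEMD.ASSAY_RUNS"),
  stringsAsFactors = FALSE
)
out_dir <- "W:/project/_Joe.B/MSSQL/try"

# Loop over the column itself: each element is one table name.
jobs <- lapply(dbtable$Tot_table, summ_job, out_dir = out_dir)

for (job in jobs) {
  cat(job$query, "->", job$file, "\n")
  # With a live connection `con`, the calls from the thread would go here:
  #   res <- dbGetQuery(con, job$query)
  #   view(dfSummary(res), file = job$file)
}
```

Since dbtable is already a data frame in R, no sqldf() query is needed to get at the names; dbtable$Tot_table is the character vector to iterate over.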



On Fri, Jul 2, 2021 at 10:20 PM Kai Yang  wrote:

 Hello Eric,
Following your suggestion, I modified the code as:
summ <- function(Tabname){
  query <- sprintf(" SELECT * FROM %s",Tabname)
  res <- dbGetQuery(con, query)
  view(dfSummary(res), file = 
"W:/project/_Joe.B/MSSQL/try/summarytools.Tabname.html")
  rm(res)
}

for (i in dbtable$Tot_table)
{
  Tabname <- as.character(sqldf(sprintf("SELECT Tot_table FROM dbtable", i)))
  summ(Tabname)
}
after submitted the work, I got the error message below:

 Error: nanodbc/nanodbc.cpp:1655: 42000: [Microsoft][ODBC Driver 17 for SQL 
Server][SQL Server]Invalid object name 'c'.  [Microsoft][ODBC Driver 17 for SQL 
Server][SQL Server]Statement(s) could not be prepared.  ' SELECT * FROM 
c("BIODBX.MECCUNIQUE2", "BIODBX.QDATA_HTML_DUMMY", "BIODBX.SET_ITEMS", 
"BIODBX.SET_NAMES", "dbo.sysdiagrams", "GEMD.ASSAY_DEFINITIONS", 
"GEMD.ASSAY_DISCRETE_VALUES", "GEMD.ASSAY_QUESTIONS", "GEMD.ASSAY_RUNS", 
"GEMD.BIODBX_DATABASE_SEED", "GEMD.BIODBX_USER_SEEDS", "GEMD.BIODBX_USERS", 
"GEMD.DATA_ENTRY_PAGES", "GEMD.DISC_SESSION_QID", "GEMD.DISC_SESSION_STATUS", 
"GEMD.DISC_SESSION_TYPE", "GEMD.DISCREPANCIES", "GEMD.DISCREPANCY_QUERY_TEMP", 
"GEMD.DISCRETE_VALUES", "GEMD.ENTERED_DATA_ENTRY_PAGES", "GEMD.ENTRY_GROUPS", 
"GEMD.ExportSampleListNames", "GEMD.FORM_STATUS_BY_SUBJECT", 
"GEMD.GEMD_CODELIST_GROUPS", "GEMD.GEMD_CODELIST_VALUES", 
"GEMD.GEMD_LOT_DEFINITIONS", "GEMD.GEMD_SAMPLES", "GEMD.GEMD_STUDIES", 
"GEMD.MECCUNIQUE", "GEMD.MECCUNIQUE2", "GEMD.MISSING_DI 

One more question: in the code "view(dfSummary(res), file = 
"W:/project/_Joe.B/MSSQL/try/summarytools.Tabname.html")",
can the Tabname part be replaced automatically as well? 
Thank you,
Kai
On Friday, July 2, 2021, 12:06:12 PM PDT, Eric Berger wrote:

 Modify the summ() function to start like this 
summ <- function(Tabname){
  query <- sprintf("SELECT * FROM %s",Tabname)
  res <- dbGetQuery(con, query)
 
etc
HTH,
Eric
On Fri, Jul 2, 2021 at 9:39 PM Kai Yang via R-help  wrote:

Hello List,

The previous post looked messy, so I am reposting my question. Sorry.


I need to generate summary report for many tables (>200 tables). For each 
table, I can use the script to generate report:
res <- dbGetQuery(con, "SELECT * FROM BIODBX.MECCUNIQUE2")
view(dfSummary(res), file = 
"W:/project/_Joe.B/MSSQL/try/summarytools.BIODBX.MECCUNIQUE2.html")
rm(res)
BIODBX.MECCUNIQUE2 is the name of table.

I have all of tables' name in a data frame. So, I'm trying to write a function 
to do this:
summ <- function(Tabname){
  res <- dbGetQuery(con, "SELECT * FROM Tabname")
  view(dfSummary(res), file = 
"W:/project/_Joe.B/MSSQL/try/summarytools.Tabname.html")
  rm(res)
}
for (i in dbtable$Tot_table)
{
  Tabname <- as.character(sqldf(sprintf("SELECT Tot_table FROM dbtable", i)))
  summ(Tabname)
}

1. I created  a function summ, the argument is Tabname. I put the Tabname in 
the function. I hope it can be replace

Re: [R] R Function question, (repost to fix the messy work format)

2021-07-02 Thread Kai Yang via R-help
 Hello Eric,
Following your suggestion, I modified the code as:
summ <- function(Tabname){
  query <- sprintf(" SELECT * FROM %s",Tabname)
  res <- dbGetQuery(con, query)
  view(dfSummary(res), file = 
"W:/project/_Joe.B/MSSQL/try/summarytools.Tabname.html")
  rm(res)
}

for (i in dbtable$Tot_table)
{
  Tabname <- as.character(sqldf(sprintf("SELECT Tot_table FROM dbtable", i)))
  summ(Tabname)
}
after submitted the work, I got the error message below:

 Error: nanodbc/nanodbc.cpp:1655: 42000: [Microsoft][ODBC Driver 17 for SQL 
Server][SQL Server]Invalid object name 'c'.  [Microsoft][ODBC Driver 17 for SQL 
Server][SQL Server]Statement(s) could not be prepared.  ' SELECT * FROM 
c("BIODBX.MECCUNIQUE2", "BIODBX.QDATA_HTML_DUMMY", "BIODBX.SET_ITEMS", 
"BIODBX.SET_NAMES", "dbo.sysdiagrams", "GEMD.ASSAY_DEFINITIONS", 
"GEMD.ASSAY_DISCRETE_VALUES", "GEMD.ASSAY_QUESTIONS", "GEMD.ASSAY_RUNS", 
"GEMD.BIODBX_DATABASE_SEED", "GEMD.BIODBX_USER_SEEDS", "GEMD.BIODBX_USERS", 
"GEMD.DATA_ENTRY_PAGES", "GEMD.DISC_SESSION_QID", "GEMD.DISC_SESSION_STATUS", 
"GEMD.DISC_SESSION_TYPE", "GEMD.DISCREPANCIES", "GEMD.DISCREPANCY_QUERY_TEMP", 
"GEMD.DISCRETE_VALUES", "GEMD.ENTERED_DATA_ENTRY_PAGES", "GEMD.ENTRY_GROUPS", 
"GEMD.ExportSampleListNames", "GEMD.FORM_STATUS_BY_SUBJECT", 
"GEMD.GEMD_CODELIST_GROUPS", "GEMD.GEMD_CODELIST_VALUES", 
"GEMD.GEMD_LOT_DEFINITIONS", "GEMD.GEMD_SAMPLES", "GEMD.GEMD_STUDIES", 
"GEMD.MECCUNIQUE", "GEMD.MECCUNIQUE2", "GEMD.MISSING_DI 
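
The c("...", "...") text in this error can be reproduced without a database. The sketch below uses a two-row stand-in for dbtable (an assumption) and shows that as.character() applied to a whole data frame collapses the column into one deparsed string, which then ends up inside the SQL text.

```r
# Stand-in for the table list; in the thread it has more than 200 rows.
dbtable <- data.frame(
  Tot_table = c("BIODBX.MECCUNIQUE2", "GEMD.ASSAY_RUNS"),
  stringsAsFactors = FALSE
)

# as.character() on a data frame deparses each column into a single string.
collapsed <- as.character(dbtable)
length(collapsed)
bad_query <- sprintf("SELECT * FROM %s", collapsed)
print(bad_query)

# Working with the column directly gives one query per table name.
queries <- sprintf("SELECT * FROM %s", dbtable$Tot_table)
print(queries)
```

sqldf() returns a data frame, so wrapping its result in as.character() produces the same kind of collapsed string on every pass through the loop.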

One more question,  in the code of "view(dfSummary(res), file = 
"W:/project/_Joe.B/MSSQL/try/summarytools.Tabname.html")",
can Tabname part be replacted automatic also? 
Thank you,
KaiOn Friday, July 2, 2021, 12:06:12 PM PDT, Eric Berger 
 wrote:  
 
Modify the summ() function to start like this:
summ <- function(Tabname){
  query <- sprintf("SELECT * FROM %s",Tabname)
  res <- dbGetQuery(con, query)
 
etc.
HTH,
Eric
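
A note on the error reported above and on the file-name question. The query text in the error starts with c("BIODBX.MECCUNIQUE2", ...), which suggests that as.character() was applied to the whole one-column data frame returned by sqldf(), giving one long deparsed string instead of a single table name. Since dbtable$Tot_table already holds the names, the loop variable can probably be used directly, and the same sprintf() idea can build the output file name. The following is only a minimal base R sketch under those assumptions: build_job() and the two-row dbtable are hypothetical stand-ins, and the database calls are left as comments because they need a live connection con.

```r
# Hypothetical helper: builds the SQL text and the report file name for one table.
build_job <- function(tabname, out_dir = "W:/project/_Joe.B/MSSQL/try") {
  list(
    query = sprintf("SELECT * FROM %s", tabname),
    file  = file.path(out_dir, sprintf("summarytools.%s.html", tabname))
  )
}

# Stand-in for the dbtable data frame from the post (two names only).
dbtable <- data.frame(Tot_table = c("BIODBX.MECCUNIQUE2", "BIODBX.SET_ITEMS"),
                      stringsAsFactors = FALSE)

# Iterate over the column itself: each element is already one table name,
# so no sqldf() call is needed inside the loop.
jobs <- lapply(dbtable$Tot_table, build_job)

for (job in jobs) {
  cat(job$query, "->", job$file, "\n")
  # With a live connection `con` (assumed), the real work would be:
  # res <- DBI::dbGetQuery(con, job$query)
  # summarytools::view(summarytools::dfSummary(res), file = job$file)
}
```

Building the query with sprintf() is reasonable here only because the table names come from a trusted list; names taken from user input would need quoting or validation first.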


[R] R Function question, (repost to fix the messy work format)

2021-07-02 Thread Kai Yang via R-help
Hello List,

My previous post looked messy, so I am reposting my question. Sorry.


I need to generate a summary report for many tables (>200 tables). For each 
table, I can use this script to generate the report:
res <- dbGetQuery(con, "SELECT * FROM BIODBX.MECCUNIQUE2")
view(dfSummary(res), file = 
"W:/project/_Joe.B/MSSQL/try/summarytools.BIODBX.MECCUNIQUE2.html")
rm(res)
BIODBX.MECCUNIQUE2 is the name of the table.

I have all of the table names in a data frame, so I'm trying to write a function 
to do this:
summ <- function(Tabname){
  res <- dbGetQuery(con, "SELECT * FROM Tabname")
  view(dfSummary(res), file = 
"W:/project/_Joe.B/MSSQL/try/summarytools.Tabname.html")
  rm(res)
}
for (i in dbtable$Tot_table)
{
  Tabname <- as.character(sqldf(sprintf("SELECT Tot_table FROM dbtable", i)))
  summ(Tabname)
}

1. I created a function summ whose argument is Tabname. I put Tabname in 
the function and hoped it would be replaced by each table name in turn.
2. The table dbtable contains all the table names (>200 rows); the field name is 
Tot_table.
3. I want to use "for" to set up a loop that automatically generates a summary 
report for each table.

But I got the error message below:
 Error: nanodbc/nanodbc.cpp:1655: 42000: [Microsoft][ODBC Driver 17 for SQL 
Server][SQL Server]Invalid object name 'Tabname'.  
[Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Statement(s) could not be 
prepared. 

 'SELECT * FROM Tabname' 
10. stop(structure(list(message = "nanodbc/nanodbc.cpp:1655: 42000: 
[Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid 
object name 'Tabname'.  [Microsoft][ODBC Driver 17 for SQL Server][SQL 
Server]Statement(s) could not be prepared. \n 
'SELECT * FROM Tabname'", 
    call = NULL, cppstack = NULL), class = c("odbc::odbc_error", 
"C++Error", "error", "condition"))) 
9.new_result(connection@ptr, statement, immediate) 
8.OdbcResult(connection = conn, statement = statement, params = params,     
immediate = immediate) 
7..local(conn, statement, ...) 
6.dbSendQuery(conn, statement, params = params, ...) 
5.dbSendQuery(conn, statement, params = params, ...) 
4..local(conn, statement, ...) 
3.dbGetQuery(con, "SELECT * FROM Tabname") 
2.dbGetQuery(con, "SELECT * FROM Tabname") 
1.summ(Tabname) 

It seems the table name is not successfully passed into the query. Can someone 
tell me how to fix this?
Many thanks,
Kai




[R] R function question

2021-07-02 Thread Kai Yang via R-help
Hello List,
I need to generate a summary report for many tables (>200 tables). For 
each table, I can use this script to generate the report:
res <- dbGetQuery(con, "SELECT * FROM BIODBX.MECCUNIQUE2")
view(dfSummary(res), file = 
"W:/project/_Joe.B/MSSQL/try/summarytools.BIODBX.MECCUNIQUE2.html")
rm(res)
BIODBX.MECCUNIQUE2 is the name of the table.

I have all of the table names in a data frame, so I'm trying to write a function 
to do this:
summ <- function(Tabname){
  res <- dbGetQuery(con, "SELECT * FROM Tabname")
  view(dfSummary(res), file = 
"W:/project/_Joe.B/MSSQL/try/summarytools.Tabname.html")
  rm(res)
}
for (i in dbtable$Tot_table){
  Tabname <- as.character(sqldf(sprintf("SELECT Tot_table FROM dbtable", i)))
  summ(Tabname)
}

1. I created a function summ whose argument is Tabname. I put Tabname in 
the function and hoped it would be replaced by each table name in turn.
2. The table dbtable contains all the table names (>200 rows); the field name is 
Tot_table.
3. I want to use "for" to set up a loop that automatically generates a summary 
report for each table.

But I got the error message below:
 Error: nanodbc/nanodbc.cpp:1655: 42000: 
[Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid object name 
'Tabname'.  [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Statement(s) 
could not be prepared. 

 'SELECT * FROM Tabname' 
10. stop(structure(list(message = "nanodbc/nanodbc.cpp:1655: 42000: 
[Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Invalid object name 
'Tabname'.  [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Statement(s) 
could not be prepared. \n 'SELECT * FROM Tabname'",     call = NULL, 
cppstack = NULL), class = c("odbc::odbc_error", "C++Error", "error", 
"condition"))) 
9.new_result(connection@ptr, statement, immediate) 
8.OdbcResult(connection = conn, statement = statement, params = params,     
immediate = immediate) 
7..local(conn, statement, ...) 
6.dbSendQuery(conn, statement, params = params, ...) 
5.dbSendQuery(conn, statement, params = params, ...) 
4..local(conn, statement, ...) 
3.dbGetQuery(con, "SELECT * FROM Tabname") 
2.dbGetQuery(con, "SELECT * FROM Tabname") 
1.summ(Tabname) 

It seems the table name is not successfully passed into the query. Can someone 
tell me how to fix this?
Many thanks,
Kai



[R] get data frame using DBI package

2021-07-01 Thread Kai Yang via R-help
Hi List,
I use odbc to connect to an MSSQL server. When I run the script
res <- dbSendQuery(con, "SELECT * FROM BIODBX.MECCUNIQUE2")
the res is "Formal class OdbcResult". Can someone help me modify the code to 
get a data frame?
Thanks,
Kai
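
For reference, dbSendQuery() only returns a result handle; the rows still have to be fetched. The sketch below shows the two usual DBI routes to a data frame. It assumes the DBI and RSQLite packages are installed and uses an in-memory SQLite database with a made-up demo table as a stand-in for the MSSQL connection in the post, so only the fetch pattern carries over.

```r
# Assumption: DBI and RSQLite are installed. The in-memory SQLite database
# and the `demo` table are stand-ins for the MSSQL server and its tables.
library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "demo",
             data.frame(id = 1:3, gene = c("MLH1", "MSH2", "APC")))

# Route 1: dbGetQuery() sends the query, fetches all rows, and cleans up.
df_get <- dbGetQuery(con, "SELECT * FROM demo")

# Route 2: keep dbSendQuery(), then fetch and release the result explicitly.
res <- dbSendQuery(con, "SELECT * FROM demo")
df_fetch <- dbFetch(res)
dbClearResult(res)

dbDisconnect(con)
```

Route 1 is the shorter choice when the whole result fits in memory; route 2 is mainly useful when rows should be fetched in chunks.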


Re: [R] trim NA from concatenate result

2021-06-02 Thread Kai Yang via R-help
Hi Rui,
I used this code to fix my problem:
try$newcol <- gsub(" NA", "", try$newcol)

But, I'll try your solution later.
Thank you for your help.
Kai
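
One caveat on the gsub() fix above: it only removes " NA" with a leading space, so a value such as "NA NA Adenocarcinoma" would still keep its first "NA". An alternative is to drop the missing values before pasting. The following is a minimal base R sketch, not the solution Rui proposed (which is not shown in this thread); paste_non_na() and the three toy columns are hypothetical.

```r
# Hypothetical helper: paste the non-missing values of each row together and
# return NA when every value in the row is missing.
paste_non_na <- function(..., sep = " ") {
  cols <- cbind(...)  # one character column per input vector
  apply(cols, 1, function(row) {
    keep <- row[!is.na(row)]
    if (length(keep) == 0) NA_character_ else paste(keep, collapse = sep)
  })
}

# Toy columns reproducing the three categories from the post.
x1 <- c(NA_character_, NA, NA)
x2 <- c(NA, NA, "Other")
x3 <- c(NA, "Adenocarcinoma", NA)

newcol <- paste_non_na(x1, x2, x3)
print(newcol)
```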


[R] trim NA from concatenate result

2021-06-02 Thread Kai Yang via R-help
Hi List,
I use the paste function to concatenate 3 character columns.
When I run table() on the result, I find 3 categories. How can I write a script to 
trim the NA from the 2nd and 3rd groups and set the first one to NA?
Thanks,
Kai
NA NA NA 
NA NA Adenocarcinoma
NA Other NA



Re: [R] replace NA into - for specific column

2021-06-02 Thread Kai Yang via R-help
Hello Jim,
Your example works well.
Thank you for your help,
Kai


[R] replace NA into - for specific column

2021-06-01 Thread Kai Yang via R-help
Hi List,
I have a column MMR in a data frame proband_crc2. The column contains missing 
values and + (meaning positive). 
I need to replace all missing values with - (meaning negative).
I tried several different ways. Below are my attempts, but none of them works. 
Can someone help me?
Thanks,
Kai
proband_crc2 %>% mutate(MMR=recode(MMR, '' = "-"))
proband_crc2            <- data.frame (ifelse(proband_crc2$MMR !="+", "-", NA))
proband_crc2$MMR <- ifelse(MMR %in% c(""," ","-"), NA, MMR)
proband_crc2$MMR[proband_crc2$MMR==NA] <- "-"
proband_crc2           <- data.frame( proband_crc2 %>% mutate(across(c("MMR"), 
~ifelse(.==NA, "-", as.character(.)
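
A likely reason the attempts above fail is that any comparison with NA, including x == NA, evaluates to NA rather than TRUE, so nothing is ever selected; is.na() is the test for missing values. The sketch below is a base R illustration on a made-up four-row version of proband_crc2, and it assumes the missing entries really are NA rather than empty strings. It is not the example Jim posted, which does not appear in this thread.

```r
# Toy stand-in for the poster's data frame; the real one has more columns.
proband_crc2 <- data.frame(MMR = c("+", NA, "+", NA),
                           stringsAsFactors = FALSE)

# is.na() returns TRUE exactly where the value is missing.
proband_crc2$MMR[is.na(proband_crc2$MMR)] <- "-"

print(proband_crc2)
```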




Re: [R] if statement and for loop question

2021-05-31 Thread Kai Yang via R-help
Hi Jim,
Sorry for posting the "same" question again. The reasons are:
1. I was asked to use plain text format, so I had to post my question again. But 
I don't know whether it is working.
2. I'm a beginner in R (< 2 months). It may not be easy for me to ask a "clear" R 
question. My current work is to port my SAS code to R, especially the 
data manipulation part. 
I'll do my best not to ask the "same" question again.
Thanks,
Kai

On Sunday, May 30, 2021, 10:44:41 PM PDT, Jim Lemon wrote:
 
 Hi Kai,
You seem to be asking the same question again and again. This does not
give us the warm feeling that you know what you want.

testdf<-data.frame(a=c("Negative","Positive","Neutral","Random","VUS"),
 b=c("No","Yes","No","Maybe","Yes"),
 c=c("Off","On","Off","Off","On"),
 d=c("Bad","Good","Bad","Bad","Good"),
 stringsAsFactors=FALSE)
testdf
match_strings<-c("Positive","VUS")
testdf$b<-ifelse(testdf$a %in% match_strings,testdf$b,"")
testdf$c<-ifelse(testdf$a %in% match_strings,testdf$c,"")
testdf$d<-ifelse(testdf$a %in% match_strings,testdf$d,"")
testdf

I have assumed that you mean "zero length strings" rather than
"zeros". Also note that your initial code was producing logical values
that were never assigned to anything.

Jim
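
The three ifelse() lines in Jim's example can also be collapsed into a single indexed assignment. This is a variation offered as a sketch, not part of Jim's reply; it reuses his testdf and match_strings, and blank_rows is a hypothetical helper name.

```r
testdf <- data.frame(a = c("Negative", "Positive", "Neutral", "Random", "VUS"),
                     b = c("No", "Yes", "No", "Maybe", "Yes"),
                     c = c("Off", "On", "Off", "Off", "On"),
                     d = c("Bad", "Good", "Bad", "Bad", "Good"),
                     stringsAsFactors = FALSE)
match_strings <- c("Positive", "VUS")

# Rows whose `a` value is not one of the strings to keep.
blank_rows <- !(testdf$a %in% match_strings)

# Blank out b, c and d for those rows in one step; "" is recycled.
testdf[blank_rows, c("b", "c", "d")] <- ""

print(testdf)
```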

On Mon, May 31, 2021 at 2:29 AM Kai Yang via R-help
 wrote:
>
> Hello List,
> I have a data frame with the following character columns:
>
> | a1 | b1 | c1 | d1 |
> | a2 | b2 | c2 | d2 |
> | a3 | b3 | c3 | d3 |
> | a4 | b4 | c4 | d4 |
> | a5 | b5 | c5 | d5 |
>
>
>
> I need to do the following: if a1 is neither "Positive" nor "VUS", then the 
> values of b1, c1 and d1 should be zeroed out. The same applies to the a2 to 
> a5 series.
> I wrote the code below to do this, but it doesn't work. Would you please 
> correct my code?
> Thank you,
> Kai
>
>
> for (i in 1:5)
> {
>  if (isTRUE(try$a[i] != "Positive" && try$a[i] != "VUS"))
>  {
>    try$b[i]== ''
>    try$c[i] == ''
>    try$d[i]== ''
>  }
> }
>
>


[R] if statement and for loop question

2021-05-30 Thread Kai Yang via R-help
Hello List,
I have a data frame with the following character columns:

| a1 | b1 | c1 | d1 |
| a2 | b2 | c2 | d2 |
| a3 | b3 | c3 | d3 |
| a4 | b4 | c4 | d4 |
| a5 | b5 | c5 | d5 |



I need to do the following: if a1 is neither "Positive" nor "VUS", then the 
values of b1, c1 and d1 should be zeroed out. The same applies to the a2 to 
a5 series.
I wrote the code below to do this, but it doesn't work. Would you please 
correct my code?
Thank you,
Kai


for (i in 1:5) 
{
  if (isTRUE(try$a[i] != "Positive" && try$a[i] != "VUS"))
  {
    try$b[i]== ''
    try$c[i] == ''
    try$d[i]== ''
  }
}
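
In the loop above, == compares two values and discards the result, so nothing is ever stored; assignment needs <-. The following sketch keeps the loop structure but assigns instead of comparing. The three-row dat is a hypothetical stand-in for the poster's data frame (called try in the post), and "zeroed out" is taken to mean an empty string.

```r
# Hypothetical stand-in for the poster's data frame.
dat <- data.frame(a = c("Negative", "Positive", "VUS"),
                  b = c("b1", "b2", "b3"),
                  c = c("c1", "c2", "c3"),
                  d = c("d1", "d2", "d3"),
                  stringsAsFactors = FALSE)

for (i in seq_len(nrow(dat))) {
  # isTRUE() treats an NA comparison as FALSE, so rows with a missing `a`
  # are left untouched.
  if (isTRUE(dat$a[i] != "Positive" && dat$a[i] != "VUS")) {
    dat$b[i] <- ""  # `<-` assigns; `==` would only compare
    dat$c[i] <- ""
    dat$d[i] <- ""
  }
}

print(dat)
```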




Re: [R] R grep question

2021-05-27 Thread Kai Yang via R-help
Hi Rui,
Thank you for your suggestion. 
But when I tried the solution, I got the message below:

Error in "MLH1" | "MSH2" :   operations are possible only for numeric, logical 
or complex types

Does this mean grepl cannot work on a character field?
Thanks,
Kai

On Thursday, May 27, 2021, 01:37:58 AM PDT, Rui Barradas wrote:
 
 Hello,

ifelse needs a logical condition, not the value. Try grepl.


CRC$MMR.gene <- ifelse(grepl("MLH1"|"MSH2",CRC$gene.all), "Yes", "No")


Hope this helps,

Rui Barradas
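
The error Kai reports comes from R's | operator being applied to two character strings before grepl() is ever called; grepl() itself works fine on character data. In a regular expression the alternation goes inside a single pattern string. The sketch below shows that form on a made-up four-row CRC data frame, so the gene.all values are illustrative only.

```r
# Toy stand-in for the poster's CRC data frame.
CRC <- data.frame(gene.all = c("MLH1", "APC; MSH2", "BRCA1", NA),
                  stringsAsFactors = FALSE)

# "MLH1|MSH2" is one pattern meaning "MLH1 or MSH2".
# grepl() returns FALSE for NA input, so missing values become "No".
CRC$MMR.gene <- ifelse(grepl("MLH1|MSH2", CRC$gene.all), "Yes", "No")

print(CRC)
```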

At 05:29 on 27/05/21, Kai Yang via R-help wrote:
> Hi List,
> I wrote the code to create a new variable:
> CRC$MMR.gene<-ifelse(grep("MLH1"|"MSH2",CRC$gene.all,value=T),"Yes","No")
>  
> 
> I need to create an MMR.gene column in the CRC data frame: if the gene.all 
> column contains MLH1 or MSH2, then MMR.gene = Yes; if not, MMR.gene = No.
> 
> But the code doesn't work for me. Can anyone tell me how to fix the code?
> 
> Thank you,
> 
> Kai


[R] R grep question

2021-05-26 Thread Kai Yang via R-help
Hi List,
I wrote the code to create a new variable:
CRC$MMR.gene<-ifelse(grep("MLH1"|"MSH2",CRC$gene.all,value=T),"Yes","No")
 

I need to create an MMR.gene column in the CRC data frame: if the gene.all column 
contains MLH1 or MSH2, then MMR.gene = Yes; if not, MMR.gene = No.

But the code doesn't work for me. Can anyone tell me how to fix the code?

Thank you,

Kai


[R] help to correct the function problem

2021-05-24 Thread Kai Yang via R-help
Hello list,
I want to turn some of my R code into a function, but I got an error 
message. I'm new to R. Please point out where my problem is.
Thank you,
Kai

The original working code:
p1 <- select(raw, Pedigree.name, UPN, Test.Result.tr_Test.Result1,
             Test.Result.tr_gene1, Test.Result.tr_Variant..nucleotide.1)
p2 <- select(raw, Pedigree.name, UPN, Test.Result.tr_Test.Result2,
             Test.Result.tr_gene2, Test.Result.tr_Variant..nucleotide.2)
p3 <- select(raw, Pedigree.name, UPN, Test.Result.tr_Test.Result3,
             Test.Result.tr_gene3, Test.Result.tr_Variant..nucleotide.3)
p4 <- select(raw, Pedigree.name, UPN, Test.Result.tr_Test.Result4,
             Test.Result.tr_gene4, Test.Result.tr_Variant..nucleotide.4)
p5 <- select(raw, Pedigree.name, UPN, Test.Result.tr_Test.Result5,
             Test.Result.tr_gene5, Test.Result.tr_Variant..nucleotide.5)

I tried to write a function to do this:
k_subset <- function(pp, aa, bb, cc){
  pp <- substitute(pp)
  aa <- substitute(aa)
  bb <- substitute(bb)
  cc <- substitute(cc)
  pp <- select(raw, Pedigree.name, UPN, aa, bb, cc)
}
k_subset(p1, Test.Result.tr_Test.Result1, Result.tr_gene1, 
Test.Result.tr_Variant..nucleotide.1 )

But I got this error message:
Note: Using an external vector in selections is ambiguous.
i Use `all_of(aa)` instead of `aa` to silence this message.
i See .
This message is displayed once per session.
Note: Using an external vector in selections is ambiguous.
i Use `all_of(bb)` instead of `bb` to silence this message.
i See .
This message is displayed once per session.
Error: Can't subset columns that don't exist
x Column `Result.tr_gene1` doesn't exist.




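
Going by the working code above, the gene column is named Test.Result.tr_gene1, while the call passes Result.tr_gene1, which would explain the final "doesn't exist" error. One way to sidestep the quoting questions altogether is to pass the column names as character strings and index the data frame with them. The following is a base R sketch under that assumption, not a dplyr solution; the two-row raw is a made-up stand-in, and this k_subset() takes the data frame as an argument instead of assigning to pp.

```r
# Made-up stand-in for the poster's `raw` data frame (first result set only).
raw <- data.frame(
  Pedigree.name = c("fam1", "fam2"),
  UPN = c(101, 102),
  Test.Result.tr_Test.Result1 = c("Positive", "VUS"),
  Test.Result.tr_gene1 = c("MLH1", "MSH2"),
  Test.Result.tr_Variant..nucleotide.1 = c("c.1A>G", "c.2T>C"),
  stringsAsFactors = FALSE
)

# Column names arrive as strings, so no substitute() is needed, and the
# function returns the subset instead of trying to assign it to `pp`.
k_subset <- function(data, result_col, gene_col, variant_col) {
  data[, c("Pedigree.name", "UPN", result_col, gene_col, variant_col)]
}

p1 <- k_subset(raw,
               "Test.Result.tr_Test.Result1",
               "Test.Result.tr_gene1",
               "Test.Result.tr_Variant..nucleotide.1")
print(p1)
```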


Re: [R] Create a function problem

2021-05-14 Thread Kai Yang via R-help
 Hi Rolf,
I am a beginner in R. 
I have a data frame raw. It contains the fields pedigree.name, UPN, 
Test.Result.tr_Test.Result1, Result.tr_gene1, 
Test.Result.tr_Variant..nucleotide.1 .. Test.Result.tr_Test.Result20, 
Result.tr_gene20, Test.Result.tr_Variant..nucleotide.20.
Basically, I want to transpose the data frame from wide format into long format. 
So I hope the function can subset those fields 20 times, 
rename them, and then stack them into one long-format data frame. After that, I 
hope I can use a "for" loop to do this.
Right now, I don't know how to fix the error.
Thank you,
Kai
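
The wide-to-long step described above can be sketched in base R by building the column names from the set number, renaming them to common names, and stacking the pieces. This is only an illustration under stated assumptions: the toy raw has two result sets instead of twenty, the column names follow the Test.Result.tr_gene1 pattern from the poster's later working code, and subset_set(), the renamed columns, and the set column are hypothetical.

```r
# Toy stand-in for `raw` with two result sets instead of twenty.
raw <- data.frame(
  Pedigree.name = c("fam1", "fam2"),
  UPN = c(101, 102),
  Test.Result.tr_Test.Result1 = c("Positive", "VUS"),
  Test.Result.tr_gene1 = c("MLH1", "MSH2"),
  Test.Result.tr_Variant..nucleotide.1 = c("c.1A>G", "c.2T>C"),
  Test.Result.tr_Test.Result2 = c("Negative", NA),
  Test.Result.tr_gene2 = c("APC", NA),
  Test.Result.tr_Variant..nucleotide.2 = c("c.3G>A", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: pull out result set `i` and give it common names.
subset_set <- function(i, data) {
  cols <- c("Pedigree.name", "UPN",
            paste0("Test.Result.tr_Test.Result", i),
            paste0("Test.Result.tr_gene", i),
            paste0("Test.Result.tr_Variant..nucleotide.", i))
  out <- data[, cols]
  names(out) <- c("Pedigree.name", "UPN", "Test.Result", "gene",
                  "Variant.nucleotide")
  out$set <- i  # remember which result set each row came from
  out
}

# One piece per result set, stacked into a single long data frame.
long <- do.call(rbind, lapply(1:2, subset_set, data = raw))
print(long)
```

For the real data, 1:2 would become 1:20; lapply() plays the role of the "for" loop here.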




On Friday, May 14, 2021, 05:38:18 PM PDT, Rolf Turner wrote:
 
 
On Fri, 14 May 2021 17:42:12 +0000 (UTC)
Kai Yang via R-help  wrote:

> Hello List, I was trying to write a function. But I got a error
> message. Can someone help me how to fix it? Many thanks,Kai
> > k_subset <- function(p, a, b, c){
> +   p  <- select(raw
> +                ,Pedigree.name
> +                ,UPN
> +                ,a
> +                ,b
> +                ,c
> +   )  
> + }
> > k_subset(p1, Test.Result.tr_Test.Result1, Result.tr_gene1,
> > Test.Result.tr_Variant..nucleotide.1 )
>  Error: object 'Test.Result.tr_Test.Result1' not found

I would have thought the error message to be completely
self-explanatory.  The object in question cannot be found.  I.e. it
does not exist, in your workspace or in any of the data bases on your
search path.

It would appear that you have not created "Test.Result.tr_Test.Result1".
Why did you expect it to be present?

Moreover, the code of your function makes no sense at all, at least not
to *my* feeble brain.  The quantities "raw", "Pedigree.name" and "UPN"
are not arguments of your function.  How do you expect k_subset() to
know what they are?

cheers,

Rolf Turner

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

  


[R] Create a function problem

2021-05-14 Thread Kai Yang via R-help
Hello List,
I was trying to write a function, but I got an error message. Can 
someone help me fix it?
Many thanks,
Kai
> k_subset <- function(p, a, b, c){
+   p  <- select(raw
+                ,Pedigree.name
+                ,UPN
+                ,a
+                ,b
+                ,c
+   )  
+ }
> k_subset(p1, Test.Result.tr_Test.Result1, Result.tr_gene1, 
> Test.Result.tr_Variant..nucleotide.1 )
 Error: object 'Test.Result.tr_Test.Result1' not found
