Re: [Rd] read.csv

2024-04-16 Thread Reed A. Cartwright
Gene names being misinterpreted by spreadsheet software (read.csv is
no different) is a classic issue in bioinformatics. It seems like
every practitioner ends up encountering this issue in due time. E.g.

https://pubmed.ncbi.nlm.nih.gov/15214961/

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-1044-7

https://www.nature.com/articles/d41586-021-02211-4

https://www.theverge.com/2020/8/6/21355674/human-genes-rename-microsoft-excel-misreading-dates


On Tue, Apr 16, 2024 at 3:46 AM jing hua zhao wrote:
>
> Dear R-developers,
>
> I came across a somewhat unexpected behaviour of read.csv() which is trivial but
> worthwhile to note -- my data involve a protein named "1433E", but to save
> space I dropped the quotes, so it becomes,
>
> Gene,SNP,prot,log10p
> YWHAE,13:62129097_C_T,1433E,7.35
> YWHAE,4:72617557_T_TA,1433E,7.73
>
> Both read.csv() and readr::read_csv() treat the prot(ein) name as numeric 1433
> (possibly confused by scientific notation), which I only noticed when I
> tried to combine data,
>
> all_data <- data.frame()
> for (protein in proteins[1:7])
> {
>    cat(protein,":\n")
>    f <- paste0(protein,".csv")
>    if(file.exists(f))
>    {
>      p <- read.csv(f)
>      print(p)
>      if(nrow(p)>0) all_data  <- bind_rows(all_data,p)
>    }
> }
>
> proteins[1:7]
> [1] "1433B" "1433E" "1433F" "1433G" "1433S" "1433T" "1433Z"
>
> dplyr::bind_rows() failed because of incompatible types; nevertheless,
> rbind() went ahead without warnings.
>
> Best wishes,
>
>
> Jing Hua

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Bioc-devel] Request to Retire COHCAP Bioconductor Package

2024-04-16 Thread Charles Warden via Bioc-devel
Hi Lori,

Thank you very much for your help.

As long as it won't be a problem that my COH e-mail address will probably not 
be able to receive messages after 4/30, I agree to direct removal.  I 
believe this is what has been discussed with Yate-Ching and Xiwei, in terms of 
the general topic.

If I understand correctly, then I appreciate you helping skip a release in the 
deprecation state (prior to removal).

Thank you again!

Sincerely,
Charles

From: Kern, Lori 
Sent: Tuesday, April 16, 2024 3:56 AM
To: bioc-devel@r-project.org ; Charles Warden 

Cc: Xiwei Wu ; Yate-Ching Yuan 
Subject: Re: Request to Retire COHCAP Bioconductor Package

We normally like to do a full round of a package in a deprecation state before 
removal completely (so deprecated in Bioc 3.19 and removed in Bioc 3.20) but 
since there are no reverse dependencies, I don't think this should be an issue 
to directly remove from 3.19 without deprecation.

Please confirm direct removal over standard deprecation processes and I can 
take care of this.




Lori Shepherd - Kern

Bioconductor Core Team

Roswell Park Comprehensive Cancer Center

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263


Re: [Rd] read.csv

2024-04-16 Thread Ben Bolker
  Tangentially, your code will be more efficient if you add the data 
files to a *list* one by one and then apply bind_rows or 
do.call(rbind,...) after you have accumulated all of the information 
(see chapter 2 of the _R Inferno_). This may or may not be practically 
important in your particular case.


Burns, Patrick. 2012. The R Inferno. Lulu.com. 
http://www.burns-stat.com/pages/Tutor/R_inferno.pdf.
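
A minimal sketch of that pattern in base R, assuming small per-protein CSV files like the ones in the original post. The temporary directory, the file contents and the helper name read_protein are made up for illustration; colClasses is set here so that the prot column stays character.

```r
# Illustrative data only: write two small CSV files to a temporary directory.
tmp <- tempfile("proteins_")
dir.create(tmp)
proteins <- c("1433B", "1433E", "1433F")   # "1433F" deliberately has no file
for (protein in proteins[1:2]) {
  writeLines(c("Gene,SNP,prot,log10p",
               sprintf("GENE,1:100_A_G,%s,7.35", protein)),
             file.path(tmp, paste0(protein, ".csv")))
}

# Hypothetical helper: read one file, or return NULL if it does not exist.
read_protein <- function(protein, dir) {
  f <- file.path(dir, paste0(protein, ".csv"))
  if (!file.exists(f)) return(NULL)
  read.csv(f, colClasses = c(prot = "character"))
}

# Accumulate the pieces in a list, then bind once at the end
# instead of growing a data frame inside the loop.
pieces <- lapply(proteins, read_protein, dir = tmp)
pieces <- Filter(Negate(is.null), pieces)
all_data <- do.call(rbind, pieces)
print(all_data)
```

The same list could be passed to dplyr::bind_rows() instead of do.call(rbind, ...) if dplyr is in use.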





Re: [Rd] read.csv

2024-04-16 Thread Dirk Eddelbuettel


As an aside, the odd format does not seem to bother data.table::fread() which
also happens to be my personally preferred workhorse for these tasks:

> fname <- "/tmp/r/filename.csv"
> read.csv(fname)
   Gene             SNP prot log10p
1 YWHAE 13:62129097_C_T 1433   7.35
2 YWHAE 4:72617557_T_TA 1433   7.73
> data.table::fread(fname)
     Gene             SNP   prot log10p
1:  YWHAE 13:62129097_C_T  1433E   7.35
2:  YWHAE 4:72617557_T_TA  1433E   7.73
> readr::read_csv(fname)
Rows: 2 Columns: 4
── Column specification ──
Delimiter: ","
chr (2): Gene, SNP
dbl (2): prot, log10p

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# A tibble: 2 × 4
  Gene  SNP              prot log10p
1 YWHAE 13:62129097_C_T  1433   7.35
2 YWHAE 4:72617557_T_TA  1433   7.73
> 

That's on Linux, everything current but dev version of data.table.

Dirk

-- 
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



Re: [Rd] read.csv

2024-04-16 Thread Duncan Murdoch

On 16/04/2024 7:36 a.m., Rui Barradas wrote:

> Hello,
>
> I wrote a file with that content and read it back with
>
> read.csv("filename.csv", as.is = TRUE)
>
> There were no problems, it all worked as expected.


What platform are you on?  I got the same output as Jing Hua:

Input filename.csv:

Gene,SNP,prot,log10p
YWHAE,13:62129097_C_T,1433E,7.35
YWHAE,4:72617557_T_TA,1433E,7.73

Output:

> read.csv("filename.csv")
   Gene             SNP prot log10p
1 YWHAE 13:62129097_C_T 1433   7.35
2 YWHAE 4:72617557_T_TA 1433   7.73

Duncan Murdoch



Re: [Bioc-devel] Bioconductor will rename the default branch on `git.bioconductor.org` to `devel`

2024-04-16 Thread Kern, Lori via Bioc-devel
Hello maintainers,

A little over a year ago, Bioconductor moved forward with renaming the default 
branch from 'master' to 'devel' on git.bioconductor.org.  For an easier 
transition we have thus far left the deprecated 'master' branch in place; if you 
pushed to it, you received a warning that this branch was deprecated and that 
you should use 'devel'.  We are planning to officially make 'master' defunct 
shortly after the next Bioconductor release.  If you have been delaying the 
transition to 'devel', we highly encourage you to check out the original 
documentation and make the transition as soon as possible. If you have 
any questions or concerns, please reach out on this mailing list or the 
dedicated community slack channel bioc-git.

Helpful resources:
1.  BiocBlog: https://blog.bioconductor.org/posts/2023-03-01-transition-to-devel/
2.  Branch Renaming FAQ: https://contributions.bioconductor.org/branch-rename-faqs.html


Cheers,


Lori Shepherd - Kern

Bioconductor Core Team

Roswell Park Comprehensive Cancer Center

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263


From: Bioc-devel  on behalf of Kern, Lori 

Sent: Wednesday, March 8, 2023 2:29 PM
To: Maria.Doyle ; bioc-devel@r-project.org 

Subject: Re: [Bioc-devel] Bioconductor will rename the default branch on 
`git.bioconductor.org` to `devel`

The Bioconductor Core Team has moved forward with the branch renaming from 
master to devel on git.bioconductor.org .

Please see

  1.  BiocBlog: https://bioconductor.github.io/biocblog/posts/2023-03-01-transition-to-devel/
  2.  Branch Renaming FAQ: https://contributions.bioconductor.org/branch-rename-faqs.html

Anyone having trouble can post to the community-bioc slack  #bioc_git channel  
or on the bioc-devel@r-project.org mailing list

Cheers,


Lori Shepherd - Kern

Bioconductor Core Team

Roswell Park Comprehensive Cancer Center

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263


From: Bioc-devel  on behalf of Maria.Doyle 

Sent: Thursday, March 2, 2023 4:59 AM
To: bioc-devel@r-project.org 
Subject: [Bioc-devel] Bioconductor will rename the default branch on 
`git.bioconductor.org` to `devel`

On March 8th, the Bioconductor Core Team will rename the default branch on 
`git.bioconductor.org` to `devel`.


For more details, see the

1. Blog post

2. Branch Rename FAQ





Maria Doyle, PhD
Bioconductor Community Manager

School of Medicine,
University of Limerick, Limerick, V94 T9PX
[I work flexible hours across several time zones. I don't expect you to read or 
respond to my emails outside of your normal working hours]



___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel




Re: [Rd] read.csv

2024-04-16 Thread peter dalgaard
Hum...

This boils down to

> as.numeric("1.23e")
[1] 1.23
> as.numeric("1.23e-")
[1] 1.23
> as.numeric("1.23e+")
[1] 1.23

which in turn comes from this code in src/main/util.c (function R_strtod)

    if (*p == 'e' || *p == 'E') {
        int expsign = 1;
        switch(*++p) {
        case '-': expsign = -1;
        case '+': p++;
        default: ;
        }
        for (n = 0; *p >= '0' && *p <= '9'; p++)
            n = (n < MAX_EXPONENT_PREFIX) ? n * 10 + (*p - '0') : n;
        expn += expsign * n;
    }

which sets the exponent to zero even if the for loop terminates immediately.  

This might qualify as a bug, as it differs from the C function strtod which 
accepts

"A sequence of digits, optionally containing a decimal-point character (.), 
optionally followed by an exponent part (an e or E character followed by an 
optional sign and a sequence of digits)."

[Of course, there would be nothing to stop e.g. "1433E1" from being converted 
to numeric.]
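
For illustration, a stricter check along the lines of the strtod description quoted above can be written as a regular expression in R. This is only a sketch: the helper name looks_numeric is hypothetical, it is not how R itself decides, and it ignores hexadecimal constants, Inf and NaN.

```r
# Hypothetical helper: TRUE only if the string matches
#   [sign] digits [. digits] [ (e|E) [sign] digits ]
# so an exponent marker must be followed by at least one digit.
looks_numeric <- function(x) {
  pattern <- "^[+-]?([0-9]+\\.?[0-9]*|\\.[0-9]+)([eE][+-]?[0-9]+)?$"
  grepl(pattern, trimws(x))
}

looks_numeric(c("1433E", "1.23e-", "1433E1", "7.35", "1433"))
# FALSE FALSE TRUE TRUE TRUE
```

A column could be screened with all(looks_numeric(x)) before deciding whether to convert it, which would leave "1433E" as text while still accepting "1433E1".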

-pd

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] read.csv

2024-04-16 Thread Rui Barradas


Hello,

I wrote a file with that content and read it back with


read.csv("filename.csv", as.is = TRUE)


There were no problems, it all worked as expected.

Hope this helps,

Rui Barradas






Re: [Bioc-devel] Request to Retire COHCAP Bioconductor Package

2024-04-16 Thread Kern, Lori via Bioc-devel
We normally like to do a full round of a package in a deprecation state before 
removal completely (so deprecated in Bioc 3.19 and removed in Bioc 3.20) but 
since there are no reverse dependencies, I don't think this should be an issue 
to directly remove from 3.19 without deprecation.

Please confirm direct removal over standard deprecation processes and I can 
take care of this.




Lori Shepherd - Kern

Bioconductor Core Team

Roswell Park Comprehensive Cancer Center

Department of Biostatistics & Bioinformatics

Elm & Carlton Streets

Buffalo, New York 14263


From: Bioc-devel  on behalf of Charles Warden 
via Bioc-devel 
Sent: Monday, April 15, 2024 7:14 PM
To: bioc-devel@r-project.org 
Cc: Xiwei Wu ; Yate-Ching Yuan 
Subject: [Bioc-devel] Request to Retire COHCAP Bioconductor Package

Hi,

I have greatly appreciated the assistance provided by Bioconductor support over 
several years!

However, my requested end date to work in the Integrative Genomics Core at City 
of Hope is 4/30/2024.

Later this year, I will be starting a PhD program at UC-Riverside.  When I was 
previously in graduate school at the University of Michigan, I switched to 
using my personal e-mail address to support the COHCAP Bioconductor package.  
However, now that I am going to graduate school again, we would prefer to 
improve on that transition.

So, after a number of discussions, we have decided to request to retire the 
COHCAP Bioconductor package 
(https://www.bioconductor.org/packages/release/bioc/html/COHCAP.html).

We will instead request users to go to the source page on GitHub 
(https://github.com/cwarden45/COHCAP), where we have made changes in the 
README file.

I have cc'd Yate-Ching Yuan (the Bioinformatics Director, at the time of 
development of the original code) as well as Xiwei Wu (the Integrative Genomics 
Core Director, where I currently work and where I was working when the code was 
first converted to be provided through Bioconductor).

Since we won't be making changes for future formatting requirements and we 
don't want to use a non-COH e-mail address as a support contact, can you please 
help with actively removing the COHCAP package from Bioconductor (before 4/30)?

Thank you very much!

Sincerely,
Charles


Charles Warden, Bioinformatics Specialist (He/Him)

Integrative Genomics Core

Shamrock Monrovia Building (655 Huntington Dr, Monrovia, CA 91016)

Room 1086

Internal Ext: 80375 | Direct: (626) 218-0375



Re: [Rd] read.csv

2024-04-16 Thread Dirk Eddelbuettel



You may need to reconsider aiding read.csv() (and alternate reading
functions) by supplying column-type info instead of relying on educated
heuristic guesses which appear to fail here due to the nature of your data.

Other storage formats can store type info. That is generally safer and may be
an option too.
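
A minimal sketch of both suggestions in base R, using the two data rows from the original post. The choice of RDS as the typed storage format and the temporary file name are assumptions made for illustration only.

```r
csv_text <- "Gene,SNP,prot,log10p
YWHAE,13:62129097_C_T,1433E,7.35
YWHAE,4:72617557_T_TA,1433E,7.73"

# Declare every column type up front instead of letting read.csv() guess.
p <- read.csv(text = csv_text,
              colClasses = c("character", "character", "character", "numeric"))
str(p)

# A format that stores types (RDS here) keeps the column classes on a round trip.
rds_file <- tempfile(fileext = ".rds")   # temporary path, for illustration
saveRDS(p, rds_file)
q <- readRDS(rds_file)
identical(sapply(p, class), sapply(q, class))
```

With the types declared, prot is read as the character string "1433E" regardless of how the reader's type guessing treats a trailing "E".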

I think this was more of an email for r-help than r-devel.

Dirk

-- 
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



[Rd] read.csv

2024-04-16 Thread jing hua zhao
Dear R-developers,

I came across a somewhat unexpected behaviour of read.csv() which is trivial but 
worthwhile to note -- my data involve a protein named "1433E", but to save 
space I dropped the quotes, so it becomes,

Gene,SNP,prot,log10p
YWHAE,13:62129097_C_T,1433E,7.35
YWHAE,4:72617557_T_TA,1433E,7.73

Both read.csv() and readr::read_csv() treat the prot(ein) name as numeric 1433 
(possibly confused by scientific notation), which I only noticed when I tried 
to combine data,

all_data <- data.frame()
for (protein in proteins[1:7])
{
   cat(protein,":\n")
   f <- paste0(protein,".csv")
   if(file.exists(f))
   {
     p <- read.csv(f)
     print(p)
     if(nrow(p)>0) all_data  <- bind_rows(all_data,p)
   }
}

proteins[1:7]
[1] "1433B" "1433E" "1433F" "1433G" "1433S" "1433T" "1433Z"

dplyr::bind_rows() failed because of incompatible types; nevertheless, 
rbind() went ahead without warnings.
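
One way to catch such a mismatch before combining is to compare the column classes across the pieces. The sketch below is an illustration only: the two data frames stand in for per-protein files, their values are made up, and the helper name column_classes is hypothetical.

```r
# Two data frames standing in for per-protein files; in the first one
# "prot" has been read as a number, in the second as text.
p1 <- data.frame(Gene = "YWHAE", prot = 1433,    log10p = 7.35,
                 stringsAsFactors = FALSE)
p2 <- data.frame(Gene = "YWHAB", prot = "1433B", log10p = 8.10,
                 stringsAsFactors = FALSE)

# Hypothetical helper: class of one column in each data frame of a list.
column_classes <- function(dfs, column) {
  vapply(dfs, function(d) class(d[[column]])[1], character(1))
}

classes <- column_classes(list(p1, p2), "prot")
print(classes)

# Report the disagreement explicitly rather than relying on rbind().
if (length(unique(classes)) > 1) {
  message("Column 'prot' has mixed types: ",
          paste(unique(classes), collapse = ", "))
}
```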

Best wishes,


Jing Hua
