Hi Thomas,

Thanks for the quick fix.

Sonali.

On 10/25/2015 1:06 PM, Thomas Girke wrote:
I fixed this in systemPipeR versions 1.4.3/1.5.3. The reason for this error
was that the tx_type column contains only NA values when a txdb is generated
with makeTxDbFromUCSC(). Returning something more meaningful here may be
useful, such as the transcript type information available when a txdb is
generated from a GFF.

Thanks,

Thomas
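
The cause described above can be reproduced on a toy GRanges object. The
following is a minimal sketch, not the actual patch applied in systemPipeR
1.4.3/1.5.3: the object `tx` and the type label "mRNA" are made up for
illustration, and the which()-based guard is only one common way to make
such a subset NA-safe.

```r
library(GenomicRanges)

# Toy stand-in for the transcript ranges: tx_type is all NA, as reported
# for a txdb built with makeTxDbFromUCSC(). Coordinates are copied from the
# first two rows of the output in the original report.
tx <- GRanges(
  seqnames = "chr1",
  ranges   = IRanges(start = c(11874, 30366), end = c(14409, 30503)),
  strand   = "+",
  tx_type  = c(NA_character_, NA_character_)
)

# Comparing NA with a string yields NA, so the logical subscript is all NA
# and GRanges subsetting rejects it with an error.
failed <- tryCatch(tx[mcols(tx)$tx_type == "mRNA"], error = function(e) e)

# which() keeps only the positions that are TRUE and drops NAs, so the same
# subset returns an empty GRanges instead of failing.
safe <- tx[which(mcols(tx)$tx_type == "mRNA")]

# Looping over the non-NA types only skips the failing iteration entirely.
types <- unique(na.omit(mcols(tx)$tx_type))
```

With an all-NA tx_type column, `failed` holds the caught error, while `safe`
and `types` are both empty, so a loop over `types` would simply do nothing.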

On Fri, Oct 23, 2015 at 12:49:09AM +0000, Thomas Girke wrote:
Thanks. Good to know. I have never tried this with a txdb instance
from makeTxDbFromUCSC(). Will fix this over the weekend.
Thomas



On Thu, Oct 22, 2015 at 5:39 PM Arora, Sonali <sar...@fredhutch.org> wrote:


Hi Thomas,

I get the following error when I try to obtain the feature types using
the function genFeatures():


library(systemPipeR)
library(GenomicFeatures)
Loading required package: AnnotationDbi
txdb <- makeTxDbFromUCSC(genome = "hg19", tablename = "refGene")
Download the refGene table ... OK
Download the refLink table ... OK
Extract the 'transcripts' data frame ... OK
Extract the 'splicings' data frame ... OK
Download and preprocess the 'chrominfo' data frame ... OK
Prepare the 'metadata' data frame ... OK
Make the TxDb object ... OK
Warning message:
In .extractCdsLocsFromUCSCTxTable(ucsc_txtable, exon_locs) :
UCSC data anomaly in 359 transcript(s): the cds cumulative length is
not a multiple of 3 for transcripts 'NM_001037501' 'NM_001277444'
'NM_001037675' 'NM_001271872' 'NM_001170637' 'NM_001300952'
'NM_015326' 'NM_017940' 'NM_001271870' 'NM_001143962' 'NM_001305275'
'NM_001146344' 'NM_001300891' 'NM_001010890' 'NM_001300891'
'NM_001289974' 'NM_001291281' 'NM_001301371' 'NM_016178'
'NM_001134939' 'NM_001080427' 'NM_001145710' 'NM_001291328'
'NM_001271466' 'NM_001017915' 'NM_005541' 'NM_000348' 'NM_001145051'
'NM_001135649' 'NM_001128929' 'NM_001080423' 'NM_001144382'
'NM_001291661' 'NM_002958' 'NM_001005861' 'NM_004636' 'NM_001005914'
'NM_001290060' 'NM_001290061' 'NM_001289930' 'NM_003715'
'NM_001290049' 'NM_001286054' 'NM_001286053' 'NM_001286052'
'NM_182524' 'NM_001075' 'NM_00 [... truncated]
feat <- genFeatures(txdb, featuretype="all", reduce_ranges=TRUE,
                    upstream=1000, downstream=0, verbose=TRUE)
Error in NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append) :
subscript contains NAs


Probably because:

Browse[2]> tx
GRanges object with 54439 ranges and 3 metadata columns:
                seqnames           ranges strand |      tx_name
                   <Rle>        <IRanges>  <Rle> |  <character>
      [1]           chr1   [11874, 14409]      + |    NR_046018
      [2]           chr1   [30366, 30503]      + |    NR_036051
      [3]           chr1   [30366, 30503]      + |    NR_036266
      [4]           chr1   [30366, 30503]      + |    NR_036267
      [5]           chr1   [30366, 30503]      + |    NR_036268
      ...            ...              ...    ... |          ...
  [54435] chrUn_gl000228 [112605, 114676]      + | NM_001306068
  [54436] chrUn_gl000228  [ 29339, 32226]      - | NM_001005217
  [54437] chrUn_gl000228  [ 29339, 32226]      - | NM_001286820
  [54438] chrUn_gl000241  [ 14739, 36767]      - |    NR_132315
  [54439] chrUn_gl000241  [ 16025, 36957]      - |    NR_132320
                  gene_id     tx_type
          <CharacterList> <character>
      [1]       100287102        <NA>
      [2]       100302278        <NA>
      [3]       100422831        <NA>
      [4]       100422834        <NA>
      [5]       100422919        <NA>
      ...             ...         ...
  [54435]       100288687        <NA>
  [54436]          448831        <NA>
  [54437]          448831        <NA>
  [54438]       100289097        <NA>
  [54439]       102723780        <NA>
-------
seqinfo: 93 sequences (1 circular) from hg19 genome
Browse[2]> unique(mcols(tx)$tx_type)
[1] NA
debug: tmp <- tx[mcols(tx)$tx_type == tx_type[i]]
Browse[2]>
Error in NSBS(i, x, exact = exact, upperBoundIsStrict = !allow.append) :
subscript contains NAs


Here is my sessionInfo

sessionInfo()
R Under development (unstable) (2015-10-15 r69519)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.2 LTS

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel stats4 stats graphics grDevices utils datasets
[8] methods base

other attached packages:
[1] GenomicFeatures_1.23.3 AnnotationDbi_1.33.0
[3] systemPipeR_1.5.1 RSQLite_1.0.0
[5] DBI_0.3.1 ShortRead_1.25.10
[7] GenomicAlignments_1.7.1 SummarizedExperiment_1.1.0
[9] Biobase_2.31.0 BiocParallel_1.5.0
[11] Rsamtools_1.23.0 Biostrings_2.39.0
[13] XVector_0.11.0 GenomicRanges_1.21.32
[15] GenomeInfoDb_1.7.1 IRanges_2.5.3
[17] S4Vectors_0.9.5 BiocGenerics_0.17.0

loaded via a namespace (and not attached):
[1] Rcpp_0.12.1 lattice_0.20-33 GO.db_3.2.2
[4] digest_0.6.8 plyr_1.8.3 futile.options_1.0.0
[7] BatchJobs_1.6 ggplot2_1.0.1 zlibbioc_1.17.0
[10] annotate_1.49.0 Matrix_1.2-2 checkmate_1.6.2
[13] proto_0.3-10 GOstats_2.37.0 splines_3.3.0
[16] stringr_1.0.0 pheatmap_1.0.7 RCurl_1.95-4.7
[19] biomaRt_2.27.0 munsell_0.4.2 sendmailR_1.2-1
[22] rtracklayer_1.31.1 base64enc_0.1-3 BBmisc_1.9
[25] fail_1.3 edgeR_3.13.0 XML_3.98-1.3
[28] AnnotationForge_1.13.0 MASS_7.3-44 bitops_1.0-6
[31] grid_3.3.0 RBGL_1.47.0 xtable_1.7-4
[34] GSEABase_1.33.0 gtable_0.1.2 magrittr_1.5
[37] scales_0.3.0 graph_1.49.1 stringi_1.0-1
[40] hwriter_1.3.2 reshape2_1.4.1 genefilter_1.53.0
[43] limma_3.27.0 latticeExtra_0.6-26 futile.logger_1.4.1
[46] brew_1.0-6 rjson_0.2.15 lambda.r_1.1.7
[49] RColorBrewer_1.1-2 tools_3.3.0 Category_2.37.0
[52] survival_2.38-3 colorspace_1.2-6




--
Thanks and Regards,
Sonali




_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
