[email protected] writes:

> Hello Deepayan,
>
> hmmm... I get
>
>> str(head(as.list(alignedLocs)))
> List of 6
>  $ 0:0:1  :List of 2
>   ..$ -: int(0) 
>   ..$ +: int(0) 

these are reads for which no unique alignment was found. You'll want
to filter them, e.g.,

filt <- compose(chromosomeFilter("chr[0-9XY]+.fa"),
                alignDataFilter(expression(filtering == "Y")))
aln <- readAligned(<...>, filter=filt)

or just

aln[filt(aln)]

Martin


>  $ 0:0:10 :List of 2
>   ..$ -: int(0) 
>   ..$ +: int(0) 
>  $ 0:0:100:List of 2
>   ..$ -: int(0) 
>   ..$ +: int(0) 
>  $ 0:0:101:List of 2
>   ..$ -: int(0) 
>   ..$ +: int(0) 
>  $ 0:0:102:List of 2
>   ..$ -: int(0) 
>   ..$ +: int(0) 
>  $ 0:0:103:List of 2
>   ..$ -: int(0) 
>   ..$ +: int(0)
>
>
>
> -Ivan
>
>
>
> ----- Original Message ----
> From: Deepayan Sarkar <[email protected]>
> To: [email protected]
> Cc: Michael Lawrence <[email protected]>; [email protected]
> Sent: Monday, 20 April, 2009 16:50:46
> Subject: Re: [Bioc-sig-seq] A myriad of classes
>
> On Mon, Apr 20, 2009 at 1:27 PM,  <[email protected]> wrote:
>> Hello Michael,
>>
>> A high level vignette with the infrastructure of the BioC would be great.
>>
>> Also, I can be more specific about a class problem I am facing. It concerns 
>> a developmental package that I am privileged to be allowed to test. It's 
>> chipseq.
>>
>> I am trying to follow a typical workflow guide as shown here:
>>
>> http://www.bioconductor.org/workshops/2009/SeattleJan09/ChIP-seq/ChipSeqWorkflow.pdf
>>
>> As you can see, the data that the package uses is not raw data but data that 
>> has been read in and labelled somehow beforehand. The document shows
>>
>> load("../data/alignedLocs.rda")
>>
>> That is not the scenario a user will find. A user will have one or several 
>> s_X_export.txt files.
>
> Right. The document refers to data that was provided to those who
> attended the course (which unfortunately we cannot yet make public).
>
>> So, my attempts to get my data read in in the simplest case is this
>>
>>> library(chipseq)
>>> library(lattice)
>>> setwd('/scratch1/igregore/ChIPseq/runs/09-04-10/GERALD_14-04-2009_niddk/')
>>> pattern <- "s_1_export.txt"
>>> alignedLocs <- as(readAligned(".",
>> +                               pattern,
>> +                               "SolexaExport",
>> +                               filter = 
>> alignDataFilter(expression(filtering == "Y"))),
>> +                   "GenomeData")
>>> class(alignedLocs)
>> [1] "GenomeData"
>> attr(,"package")
>> [1] "BSgenome"
>
> Perfect.
>
>> The guide says that alignedLocs should be a GenomeDataList class object but 
>> it shows up as class GenomeData. The guide also shows
>
> That's because 'alignedLocs' contained several such objects,
> representing the data obtained from multiple lanes (possibly across
> multiple runs). To create such an object, you can do
>
> alignedLocs <- GenomeDataList(list(a = alignedLocs1, b = alignedLocs2))
>
> etc. where alignedLocs1, alignedLocs2, etc. are "GenomeData" objects
> (from individual calls to readAligned).
>
>>> alignedLocs
>>  A GenomeDataList instance of length 3
>>
>> but when I try it as is I get:
>>
>>>  alignedLocs
>>   A GenomeData instance of length 51154
>
> That is a bit odd though. Individual lanes should consist of
> chromosome-level data, and this suggests that you have 51154
> chromosomes. Perhaps you can give us the output of
>
> str(head(as.list(alignedLocs)))
>
> -Deepayan
>
>
>
>
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Reply via email to