Hi Ed,

Awesome! It works! Thanks so much for all your help, I really appreciate it.
For reference, in case anybody else has the same issue and reads this
exchange, my gff3 file now looks like this:

##gff-version 3
MRGH6_contig ltr_finder:def match 55509 64198 6 - . Source=ltr_finder:def; Strand=-; ID=ltr_finder:def_ltr_finder_1_LTR_retrotransposon; name=LTR_retrotransposon
MRGH6_contig ltr_finder:def match_part 62841 62857 6 - . Source=ltr_finder:def; Strand=-; Parent=ltr_finder:def_ltr_finder_1_LTR_retrotransposon; ID=ltr_finder:def_ltr_finder_1_primer_binding_site; name=primer_binding_site;
MRGH6_contig ltr_finder:def match_part 56801 56815 6 - . Source=ltr_finder:def; Strand=-; Parent=ltr_finder:def_ltr_finder_1_LTR_retrotransposon; ID=ltr_finder:def_ltr_finder_1_RR_tract; name=RR_tract;
MRGH6_contig ltr_finder:def match_part 55509 56766 6 - . Source=ltr_finder:def; Strand=-; Parent=ltr_finder:def_ltr_finder_1_LTR_retrotransposon; ID=ltr_finder:def_ltr_finder_1_five_prime_LTR; name=five_prime_LTR;
MRGH6_contig ltr_finder:def match_part 62941 64198 6 - . Source=ltr_finder:def; Strand=-; Parent=ltr_finder:def_ltr_finder_1_LTR_retrotransposon; ID=ltr_finder:def_ltr_finder_1_three_prime_LTR; name=three_prime_LTR;

Note that the type of the parent element is match, and the type of the
child elements is match_part. Also, I changed the "Name" tag to "name".
It seems the capitalized one is a reserved field, because the column
would show up but there was no value in it.

Cheers,
Elizabeth

On Thu, May 13, 2010 at 6:02 PM, Ed Lee <[email protected]> wrote:

> Hi Elizabeth,
>
> If you encode the transposon as match/match_part, they'll show up in
> the results panel (the black panel). To encode the grouping, you'll
> want to do something like this:
>
> <gff3>
>
> ##gff-version 3
> MRGH6_contig ltr_finder:def transposable_element 55509 64198 6 - . Name=transposable_element_1;ID=transposable_element_1
> MRGH6_contig ltr_finder:def three_prime_LTR 62941 64198 6 - . Name=three_prime_LTR;Source=ltr_finder:def;Strand=-;ID=three_prime_LTR_1;Parent=transposable_element_1
> MRGH6_contig ltr_finder:def primer_binding_site 62841 62857 6 - . Name=primer_binding_site;Source=ltr_finder:def;Strand=-;ID=primer_binding_site_1;Parent=transposable_element_1
> MRGH6_contig ltr_finder:def RR_tract 56801 56815 6 - . Name=RR_tract;Source=ltr_finder:def;Strand=-;ID=RR_tract_1;Parent=transposable_element_1
> MRGH6_contig ltr_finder:def five_prime_LTR 55509 56766 6 - . Name=five_prime_LTR;Source=ltr_finder:def;Strand=-;ID=five_prime_LTR_1;Parent=transposable_element_1
>
> </gff3>
>
> Note that every feature has a different ID and the child features link
> back to the transposon using the "Parent" attribute. If you add
> "column : ID" to your tiers file for the appropriate type, you'll get
> the ID added to the list of attributes so that you can figure out
> what each child represents.
>
> Cheers,
> Ed
>
> On Wed, 12 May 2010, elizabeth henaff wrote:
>
>> Hi Ed,
>> I'm not quite sure how to define the groupings if I convert the types
>> to match or match_part. It seems that if I use match_part it groups
>> the features by the source (2nd) column. I also tried assigning the
>> same ID tag (in the 9th column) to each set of features that I'd like
>> to be connected, but in that case they are still grouped by source,
>> with only the first element that has a given ID showing up.
>> Also, where is the type definition for match? I didn't find it in the
>> gff3.tiers file.
>> I've also tried renaming the types as transcript for the
>> retrotransposons, and exon for the transposon parts, while preserving
>> the original SO types in the Name tag. This seems to fit my needs in
>> that I can use the Parent / Child relationships without losing the
>> information as to the SO type.
>> If I can't find a way to display them as discontinuous features in
>> the results section, I might stick with that work-around.
>>
>> Thanks for any ideas!
>>
>> Elizabeth
>>
>> On Wed, May 12, 2010 at 8:29 PM, Ed Lee <[email protected]> wrote:
>> Hi Elizabeth,
>>
>> Are you trying to view these features in the annotation or results
>> panel? For annotations, Apollo doesn't support multi-tier features
>> other than gene-transcript-exon (and its variants). With the current
>> implementation, you will not be able to group the child features
>> together (and keep their correct SO type). If you want to view them
>> in the results panel, you can just convert the types to
>> match/match_part and you can get boxes connected by lines for
>> grouping the individual features together.
>>
>> Cheers,
>> Ed
>>
>> On Wed, 12 May 2010, elizabeth henaff wrote:
>>
>> Hi Ed,
>> Thanks for your response.
>>
>> The data I have is such that I had one Parent (retrotransposon) with
>> multiple child features (LTR, primer binding site, etc). It is
>> meaningful for me to retain these child features, so I was wondering
>> if it would be possible to circumvent the child-parent relationship
>> by eliminating the Parent feature, and representing all its children
>> as one discontinuous feature.
>> I've tried to do that by adding an identical ID attribute to the
>> children that belong together and "groupby : GENE" to the [type]
>> definition, but when I load the gff3 I only see the first feature
>> with a given ID.
>>
>> this is what I have added to gff3.tiers:
>>
>> [Tier]
>> tiername : LTR_de_novo
>> expanded : true
>> labeled : true
>> curated : true
>>
>> [Type]
>> label : LTR_retrotransposon
>> tiername : LTR_de_novo
>> datatype : primer_binding_site
>> datatype : RR_tract
>> datatype : five_prime_LTR
>> datatype : three_prime_LTR
>> datatype : target_site_duplication
>> datatype : transposon_fragment
>> glyph : ThinRectangle
>> color : red
>> utr_color : 250,250,210
>> column : GENOMIC_LENGTH 1
>> column : GENOMIC_RANGE 1
>> column : SCORE
>> column : NAME
>> column : Source
>> column : Strand
>>
>> and here's a snippet of the gff3 file:
>>
>> ##gff-version 3
>> MRGH6_contig ltr_finder:def three_prime_LTR 62941 64198 6 - . Name=three_prime_LTR; Source=ltr_finder:def; Strand=-; ID=ltr_finder:def_ltr_finder_1
>> MRGH6_contig ltr_finder:def primer_binding_site 62841 62857 6 - . Name=primer_binding_site; Source=ltr_finder:def; Strand=-; ID=ltr_finder:def_ltr_finder_1
>> MRGH6_contig ltr_finder:def RR_tract 56801 56815 6 - . Name=RR_tract; Source=ltr_finder:def; Strand=-; ID=ltr_finder:def_ltr_finder_1
>> MRGH6_contig ltr_finder:def five_prime_LTR 55509 56766 6 - . Name=five_prime_LTR; Source=ltr_finder:def; Strand=-; ID=ltr_finder:def_ltr_finder_1
>> MRGH6_contig ltr_finder:def primer_binding_site 105081 105103 6 - . Name=primer_binding_site; Source=ltr_finder:def; Strand=-; ID=ltr_finder:def_ltr_finder_2
>> MRGH6_contig ltr_finder:def RR_tract 95013 95027 6 - . Name=RR_tract; Source=ltr_finder:def; Strand=-; ID=ltr_finder:def_ltr_finder_2
>> MRGH6_contig ltr_finder:def five_prime_LTR 94168 95008 6 - . Name=five_prime_LTR; Source=ltr_finder:def; Strand=-; ID=ltr_finder:def_ltr_finder_2
>> MRGH6_contig ltr_finder:def three_prime_LTR 105170 106020 6 - . Name=three_prime_LTR; Source=ltr_finder:def; Strand=-; ID=ltr_finder:def_ltr_finder_2
>>
>> Thanks for your help!
>>
>> Elizabeth
>>
>> On Wed, May 12, 2010 at 6:15 PM, Ed Lee <[email protected]> wrote:
>> Hi Elizabeth,
>>
>> The only parent-child relationship that Apollo currently supports is
>> the gene-transcript-exon relationship. Unless you need to retain the
>> LTR relationship, I would suggest choosing the feature that reflects
>> the analysis best (either the parent or child) and then removing the
>> other (if removing the parent, remember to remove the "Parent"
>> attribute from the children in the GFF3 file).
>>
>> Cheers,
>> Ed
>>
>> On Wed, 12 May 2010, elizabeth henaff wrote:
>>
>> Hello all,
>>
>> I'm trying to use Apollo to annotate transposable elements in some
>> BAC sequences. I have run various prediction programs that generate
>> data in GFF3 format, and my idea is to load these results into Apollo
>> to be able to manually curate them.
>>
>> The nature of the data is such that there are Parent-Child
>> relationships between features (as defined by the Parent=xxx tag in
>> the 9th column). For example, a three_prime_LTR feature would have as
>> a Parent an LTR_retrotransposon feature, indicating that that LTR
>> sequence belongs to a particular retrotransposon.
>> When I load the gff3 file into Apollo, I get the following error
>> message:
>> "only transcripts and exons can be children of annotations:
>> three_prime_LTR[three_prime_LTR]"
>> and a similar error message for each feature that has a Parent tag.
>> Apollo then starts up, but only the features that do not have a
>> Parent tag appear.
>>
>> In gff3.tiers I have defined a new [tier] as TE_annotation and a
>> [type] for each of the datatypes (the third column of the gff3 file)
>> present in my gff3 file. For each of the [type] fields I defined the
>> number_of_levels field as 3, but that doesn't seem to fix the
>> problem.
>> Any idea how I can resolve this?
>>
>> Thanks so much for any help!
>>
>> I've tried to attach the gff3, fasta and gff3.tiers files I'm using,
>> but it seems like it makes the message too big. I'd be happy to send
>> them if that would help.
>> I'm running version 1.11.2 on Ubuntu.
>>
>> Cheers,
>>
>> Elizabeth
>>
>> Centre de Recerca en AgriGenomica
>> Barcelona, Espanya
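
For anyone who wants to script the conversion that resolved this thread (parent features retyped as match, child features as match_part, and the "Name" tag replaced by "name"), here is a minimal Python sketch. It is not part of Apollo. The function names and the sample IDs are made up for illustration, and it assumes the input is tab-separated GFF3 in which every child already carries a Parent= attribute pointing at a parent with a unique ID.

```python
"""Sketch: retype a parent/child GFF3 file as match/match_part."""


def parse_attributes(column9):
    """Split a GFF3 attribute column into a dict (insertion-ordered)."""
    attrs = {}
    for field in column9.split(";"):
        field = field.strip()
        if not field:
            continue  # tolerate trailing ';' and '; ' separators
        key, _, value = field.partition("=")
        attrs[key] = value  # a repeated key keeps its last value
    return attrs


def to_match_gff3(lines):
    """Return the lines with column 3 rewritten to match/match_part.

    Features with a Parent attribute become match_part, all others
    become match. The original type (or Name value) is kept in a
    lowercase "name" attribute so the SO term is not lost.
    """
    out = []
    for line in lines:
        line = line.rstrip("\n")
        cols = line.split("\t")
        if line.startswith("#") or len(cols) != 9:
            out.append(line)  # pass through directives and odd lines
            continue
        attrs = parse_attributes(cols[8])
        so_type = cols[2]
        cols[2] = "match_part" if "Parent" in attrs else "match"
        if "Name" in attrs:
            attrs["name"] = attrs.pop("Name")
        else:
            attrs.setdefault("name", so_type)
        cols[8] = ";".join(f"{key}={value}" for key, value in attrs.items())
        out.append("\t".join(cols))
    return out


if __name__ == "__main__":
    # Hypothetical two-feature example; "te1" is an invented ID.
    sample = [
        "##gff-version 3",
        "MRGH6_contig\tltr_finder:def\tLTR_retrotransposon\t55509\t64198"
        "\t6\t-\t.\tID=te1",
        "MRGH6_contig\tltr_finder:def\tfive_prime_LTR\t55509\t56766"
        "\t6\t-\t.\tName=five_prime_LTR; ID=te1_five_prime_LTR; Parent=te1",
    ]
    print("\n".join(to_match_gff3(sample)))
```

If the file only has children sharing one ID (as in the snippet from the second message), the parent line with its own ID would have to be added first, since this sketch only retypes features that are already linked.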
_______________________________________________
apollo mailing list
[email protected]
http://mail.fruitfly.org/mailman/listinfo/apollo
