from:"David Matthews"

[galaxy-user] moving files between galaxy servers

2011-01-10 Thread David Matthews

Hi,

I am trying to move my histories from the galaxy server at psu to the one at 
the Ratsch lab (http://galaxy.tuebingen.mpg.de) but all my attempts seem to be 
taking forever. I have tried moving the whole set or individual parts of it but 
with no success. From the Ratsch lab server I get the following message: The 
resource could not be found. 
File Not Found 
(/home/galaxy/galaxy-2.1.2009/database/files/064/dataset_64040.dat).

Any ideas?

Best Wishes,
David

___
galaxy-user mailing list
galaxy-user@lists.bx.psu.edu
http://lists.bx.psu.edu/listinfo/galaxy-user

[galaxy-user] Stalled cuffdiff run

2011-01-21 Thread David Matthews

Hi Jeremy,

I have a stalled cufflinks run - it's been queued all day - any idea why it's 
stalled?

Cheers
David



__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058

d.a.matth...@bristol.ac.uk





Re: [galaxy-user] get wig file after tophat

2011-02-21 Thread David Matthews

Hi,

You can get an equivalent visualisation from the IGV viewer by the Broad 
Institute - it's under IGV tools and generates a tdf file from bam or sam files. 
This also gives a quick and easy way of looking at depth at any particular site 
and is very accessible.

Cheers
David


On 21 Feb 2011, at 21:44, Jeremy Goecks wrote:

> Hi all,
> 
> Ann is correct - Tophat does not produce .wig files when run anymore. 
> However, it's fairly easy to use Galaxy to make a wiggle-like coverage file 
> from a BAM file:
> 
> (a) run the pileup tool on your BAM to create a pileup file;
> (b) cut columns 1 and 4 to get your coverage file.
> 
> A final note: it's often difficult to visualize coverage files because 
> they're so large. You might be better off visualizing the BAM file and using 
> the coverage file for statistics.
> 
> Best,
> J.
> 
>> Hello,
>> 
>> I think I know the answer (sort of) to this question.
>> 
>> This may be because newer versions of tophat stopped running the "wiggles"
>> program, which is still part of the tophat distribution and is the program
>> that makes the "coverage.wig" file.
>> 
>> A later version of tophat might bring this back, however - there's a note to
>> this effect in the tophat python code.
>> 
>> So if you can run wiggles, you can make the "coverage.wig" file on your own.
>> 
>> A student here at UNC Charlotte (Adam Baxter) made a few changes to the
>> "wiggles" source code that would allow you to use it with samtools to make a
>> "coverage.wig" file from the "accepted_hits.bam" file that TopHat creates.
>> 
>> If you (or anyone else) would like a copy, please email Adam, who is cc'ed
>> on this email.
>> 
>> We would be happy to help add it to Galaxy if this would be of interest to
>> you or other Galaxy users.
>> 
>> If there is any way we can be of assistance, please let us know!
>> 
>> Very best wishes,
>> 
>> Ann Loraine
>> 
>> 
>> On 2/21/11 3:39 PM, "Ying Zhang"  wrote:
>> 
>>> Hi:
>>> 
>>> I am using tophat in galaxy to analyze my paired-end RNA-seq data and have
>>> found that after the tophat analysis we can no longer get the wig file from
>>> it, which we used to be able to do. Do you have any idea how to still get
>>> the wig file after tophat analysis? Thanks a lot!
>>> 
>>> Best
>>> 
>>> Ying Zhang, M.D., Ph.D.
>>> Postdoctoral Associate
>>> Department of Genetics,
>>> Yale University School of Medicine
>>> 300 Cedar Street,S320
>>> New Haven, CT 06519
>>> Tel: (203)737-2616
>>> Fax: (203)737-2286
>>> ___
>>> The Galaxy User list should be used for the discussion
>>> of Galaxy analysis and other features on the public
>>> server at usegalaxy.org. For discussion of local Galaxy
>>> instances and the Galaxy source code, please use the
>>> Galaxy Development list:
>>> 
>>> http://lists.bx.psu.edu/listinfo/galaxy-dev
>>> 
>>> To manage your subscriptions to this and other
>>> Galaxy lists, please use the interface at:
>>> 
>>> http://lists.bx.psu.edu/
>> 
>> -- 
>> Ann Loraine
>> Associate Professor
>> Dept. of Bioinformatics and Genomics, UNCC
>> North Carolina Research Campus
>> 600 Laureate Way
>> Kannapolis, NC 28081
>> 704-250-5750
>> www.transvar.org
>> 
>> 
> 
> 



[galaxy-user] Installing Galaxy locally

2011-02-21 Thread David Matthews

Hi,

I am not a computer person and very much like using Galaxy because it's nice and 
easy for non-bioinformaticians. Tomorrow I am going to have a meeting with the 
computer science department here at Bristol and I am hoping to persuade them to 
install Galaxy within the High Performance Computing Centre for University of 
Bristol users. As I understand it this is relatively straightforward - that is 
to say the project was designed so that they can install the whole thing locally 
and reproduce the setup here, so I and others can use it here instead of clogging 
up your machines with our data and requests (!). Is this right? There are no 
fees or anything and this is something most computer centres should be able to 
do? I know this may seem a silly question but it seems prudent to ask!

Cheers
David
 

Re: [galaxy-user] get wig file after tophat

2011-02-22 Thread David Matthews

Hi,

The option you need in IGV tools is "count". You set a window size and this 
gives you a tdf file from your sorted bam (or sam) file which is nice and quick 
to view on IGV.
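
For anyone driving IGVtools from a script rather than from the IGV menu, the 
"count" step might look roughly like the sketch below. It assumes the igvtools 
launcher is on your PATH; the function names, file names, the 25 bp window and 
the hg19 genome id are placeholders for illustration, not values from this 
thread.

```python
import subprocess


def igvtools_count_cmd(sorted_bam, out_tdf, genome="hg19", window=25):
    """Build the argument list for `igvtools count`.

    -w sets the window size (in bp) over which coverage is averaged.
    The defaults here are illustrative assumptions only.
    """
    return ["igvtools", "count", "-w", str(window), sorted_bam, out_tdf, genome]


def run_igvtools_count(sorted_bam, out_tdf, genome="hg19", window=25):
    """Run the command; raises CalledProcessError if igvtools fails."""
    subprocess.run(igvtools_count_cmd(sorted_bam, out_tdf, genome, window), check=True)
```

Calling run_igvtools_count("accepted_hits.sorted.bam", "accepted_hits.tdf") 
would then write a tdf file that IGV can load; the bam (or sam) file has to be 
sorted first.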


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk






On 22 Feb 2011, at 15:52, Ying Zhang wrote:

> Dear David:
> 
> thank you very much for helping me!
> 
> I have downloaded IGV and I did find IGVtools, however, I am not sure which
> tool I should use to generate a tdf file. The tile function will generate a
> tdf file, but its input file formats do not include bam or sam files; instead
> it needs a wig file. But I have no wig file to put in. So I am wondering
> whether you need to use another tool first. I really appreciate your help!
> Thank you very much!
> 
> Best
> 
> Ying
> 

[galaxy-user] RNA seq analysis

2011-02-23 Thread David Matthews

Hi Jeremy,

I thought I'd write to start a discussion of a workflow for people doing RNA seq. 
I have found it very useful, and it addresses some issues in mapping mRNA derived 
RNA-seq paired end data to the genome using tophat. Here is the approach I use 
(I have a human mRNA sample deep sequenced with 56bp paired end reads on an 
Illumina, generating 29 million reads):

1. Align to hg19 (in my case) using tophat, allowing up to 40 hits for each 
sequence read.
2. In samtools filter for "read is unmapped", "mate is mapped" and "mate is 
mapped in a proper pair".
3. Use "group" to group the filtered sam file on c1 (which is the 
"bio-sequencer" read number) and set an operation to count on c1 as well. This 
provides a list of the reads and how many times they map to the human genome. 
Because you have filtered the set for reads that have a mate pair, there will be 
an even number for each read. For most of the reads the number will be 2 
(indicating the forward read maps once and the reverse read maps once and in a 
proper pair) but for reads that map ambiguously the number will be a higher 
multiple of 2. Counting these up, I find that 18 million reads map once, 1.3 
million map twice, 400,000 reads map 3 times and so on until you get down to 1 
read mapping 30 times, 1 read mapping 31 times and so on...
4. Filter the list to remove any reads with a count of more than 2.
5. Use "compare two datasets" to match this new list of reads (those that appear 
only twice) against your sam file and pull out all the records for those reads 
(i.e. the mate pairs).
6. You'll need to sort the sam file before you can use it with other 
applications like IGV.
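
For readers who prefer a script to the Galaxy tools, steps 2 to 6 can be 
sketched in plain Python as below. This is an illustration, not the published 
workflow: it assumes step 2 is meant to keep mapped reads whose mate is also 
mapped and in a proper pair, it reads the flag from SAM column 2, and the 
function names are made up for the example. It holds every record in memory, so 
it is only practical for small files.

```python
from collections import Counter

PROPER_PAIR = 0x2     # SAM flag bit: read mapped in a proper pair
UNMAPPED = 0x4        # SAM flag bit: read is unmapped
MATE_UNMAPPED = 0x8   # SAM flag bit: mate is unmapped


def keep_record(flag):
    """Step 2 (assumed meaning): mapped read, mapped mate, proper pair."""
    return (not flag & UNMAPPED) and (not flag & MATE_UNMAPPED) and bool(flag & PROPER_PAIR)


def unique_proper_pairs(sam_lines):
    """Return SAM records whose read name occurs exactly twice after filtering."""
    records = []
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if keep_record(int(fields[1])):
            records.append(fields)
    # Step 3: group on column 1 (read name) and count.
    counts = Counter(fields[0] for fields in records)
    # Step 4: keep only names with a count of exactly 2.
    unique = {name for name, n in counts.items() if n == 2}
    # Step 5: pull the matching records back out of the filtered set.
    kept = [fields for fields in records if fields[0] in unique]
    # Step 6: sort by reference name (as text) and then by position.
    kept.sort(key=lambda fields: (fields[2], int(fields[3])))
    return ["\t".join(fields) for fields in kept]
```

Note that the sort here orders chromosome names as plain text, which may not 
match the order a downstream tool expects.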

What you end up with is a sam file where all the reads map to one site only and 
all the reads map as a proper pair. This may seem similar to setting tophat to 
ignore non-unique reads. However, it is not. This approach gives you 10-15% 
more reads. I think it is because if tophat finds (for example) that the 
forward read maps to one site but the reverse read maps to two sites it throws 
away the whole read. By filtering the sam file to restrict it to only those 
mappings that make sense you increase the number of unique reads by getting rid 
of irrational mappings.

Has anyone else found this? Does this make sense to anyone else? Am I making a 
huge mistake somewhere?

A nice aspect of this (or at least I think so!) is that by filtering in this 
manner you can also create a sam file of non-unique mappings which you can 
monitor. This can be useful if one or more genes generate a lot of non-unique 
mappings, which may make it hard to estimate their expression accurately. You 
also get a list of how many multi hits you have in your data so you know the 
scale of the problem.
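
As a rough illustration of that last point, the tally of multi hits and the 
list of reads to send to a separate "non-unique" file could be computed along 
these lines. The function names are hypothetical, and the input is assumed to 
be the read names (sam column 1) of the already filtered records.

```python
from collections import Counter


def hit_histogram(read_names):
    """Map 'records per read name' to 'number of read names with that count'."""
    per_read = Counter(read_names)
    return Counter(per_read.values())


def multihit_names(read_names, max_records=2):
    """Read names with more than max_records records, i.e. the multi hits."""
    per_read = Counter(read_names)
    return {name for name, n in per_read.items() if n > max_records}
```

With proper pairs, a count of 2 means one mapping site, 4 means two sites, and 
so on, so the histogram shows the scale of the multi hit problem directly.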

Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk





Re: [galaxy-user] RNA seq analysis

2011-02-23 Thread David Matthews

Hi all,

Further to my last email, I've published a workflow (Bristol workflow) 
which does what I described below - hope this helps in understanding what I'm 
on about (!).

Best Wishes,
David.





Re: [galaxy-user] RNA seq analysis

2011-02-24 Thread David Matthews

Thanks Ann for your comments and for the stuff you showed at IGB - looks very 
interesting. I agree that multihits may be the equivalent of the problem you 
describe from microarrays. I think, for me anyway, knowing the scale of the 
issue is the key thing at this stage. As you imply in your email, the next 
- and potentially very interesting - step is to figure out where these 
multihits are and how they came to be. I guess it all comes down to where do 
genes come from? Well, many of them come from other genes via duplications, 
transpositions etc etc!

I have made a slight alteration to this "bristol" workflow which now 
automatically creates a sorted sam file of the multihits (forgot to put it in 
1st time round!)

Cheers
David


On 24 Feb 2011, at 12:08, Ann Loraine wrote:

> 
> Hello,
> 
> I like your approach of running the alignment tools with liberal settings and 
> then filtering the results into different categories.
> 
> This discussion reminds me of how in expression microarray analysis, we face 
> uncertainty as to what molecules (exactly) are hybridizing to the probes on a 
> chip. 
> 
> Maybe the ambiguity of mapping short sequence reads introduces similar 
> uncertainty?  
> 
> I also like your idea of capturing the reads that map multiple times. 
> 
> It’s interesting to visualize the alignments for reads that map onto multiple 
> locations in a genome.
> 
> An example (from data expressed in “wiggle” format) is described here:
> 
> https://wiki.transvar.org/confluence/x/w4BJAQ
> 
> My apologies for posting another IGB citation, but I think it can be 
> interesting and informative to see the data in this way, and IGB makes it 
> easy to zoom in and out through the data and find patterns quickly.
> 
> One of the first things I noticed when I started looking at coverage graphs 
> made from multi-mapping reads is that (1) there are a lot of them and (2) 
> they expose tandemly duplicated genes.  
> 
> Here’s a link to an image showing a particularly striking example from a 
> single-read, 75 bp RNA-Seq data set from Arabidopsis thaliana Col-0. The 
> pattern of read alignment is nearly identical between the two genes. 
> https://wiki.transvar.org/confluence/download/attachments/21594307/tandem-duplication.png
> 
> You can’t see it from the image, of course, but if I right-click one of the 
> genes, IGB links out to a Web page describing the gene at 
> www.arabidopsis.org, the main on-line database for Arabidopsis. (Human genes 
> link to NCBI.) 
> 
> -Ann
> 
> 
> On 2/23/11 11:05 PM, "Jeremy Goecks"  wrote:
> 
>> Hi David,
>> 
>> This is a really interesting workflow. My comments:
>> 
>> (1) I encourage you to start a discussion about this idea on seqanswers.com 
>> <http://seqanswers.com> ; you'll reach more people and may have a better 
>> discussion there. Ideally, you'll get a Tophat developer to chime in on what 
>> I perceive to be the main issue, which is:
>> 
>>> This may seem similar to setting tophat to ignore non-unique reads. 
>>> However, it is not. This approach gives you 10-15% more reads. I think it 
>>> is because if tophat finds (for example) that the forward read maps to one 
>>> site but the reverse read maps to two sites it throws away the whole read.
>> 
>> Remember that Tophat uses Bowtie to map reads, so it would make sense to 
>> look carefully at the Bowtie documentation to see how it handles paired-end 
>> reads. I can't find anything that directly addresses your issue. The other 
>> thing to consider is how Tophat maps reads -- it breaks them up in order to 
>> find splice junctions -- and so I'm not sure that Tophat/Bowtie is really 
>> mapping paired reads; it may be doing some hybrid single/paired-end mapping. 
>> Also, at one time, you could specify Bowtie parameters when running Tophat, 
>> but I don't see that option anymore.
>> 
>> (2) It would be interesting to know whether you get qualitatively different 
>> results via Cufflinks (or another transcriptome analysis software package) 
>> using your method vs. just using Tophat w/ and w/o ignoring non-unique 
>> reads. A skeptical view of your workflow would note that (a) multi-mapping 
>> reads may be legitimate and should not be filtered out and (b) 
>> Cufflinks/compare/diff assembly and quantitation may smooth out stray reads 
>> enough so that your method isn't necessary.
>> 
>> Thanks for the interesting post,
>> J.
>> 

Re: [galaxy-user] RNA seq analysis

2011-02-24 Thread David Matthews

Hi Jeremy,

Thanks for the feedback. I know what you mean about tophat not having the same 
functionality as bowtie. However, whatever tophat does (now or in the future), 
I think it is useful to collect the multihits separately, since either you 
leave them in and overestimate gene expression or you remove them and 
underestimate gene expression. As you suggested, I put this up on Seqanswers to 
see if anyone else likes/doesn't like it - we'll see how it goes. I certainly 
find it handy - not least to reassure myself that when I get the gene 
expression data I can tell if there are any "funny" reads making up the numbers!

Cheers
David

P.S. I modified the workflow to include collecting the multihits in a separate 
sorted sam file. 


On 24 Feb 2011, at 04:05, Jeremy Goecks wrote:

> Hi David,
> 
> This is a really interesting workflow. My comments:
> 
> (1) I encourage you to start a discussion about this idea on seqanswers.com; 
> you'll reach more people and may have a better discussion there. Ideally, 
> you'll get a Tophat developer to chime in on what I perceive to be the main 
> issue, which is:
> 
>> This may seem similar to setting tophat to ignore non-unique reads. However, 
>> it is not. This approach gives you 10-15% more reads. I think it is because 
>> if tophat finds (for example) that the forward read maps to one site but the 
>> reverse read maps to two sites it throws away the whole read.
> 
> Remember that Tophat uses Bowtie to map reads, so it would make sense to look 
> carefully at the Bowtie documentation to see how it handles paired-end reads. 
> I can't find anything that directly addresses your issue. The other thing to 
> consider is how Tophat maps reads -- it breaks them up in order to find 
> splice junctions -- and so I'm not sure that Tophat/Bowtie is really mapping 
> paired reads; it may be doing some hybrid single/paired-end mapping. Also, at 
> one time, you could specify Bowtie parameters when running Tophat, but I 
> don't see that option anymore.
> 
> (2) It would be interesting to know whether you get qualitatively different 
> results via Cufflinks (or another transcriptome analysis software package) 
> using your method vs. just using Tophat w/ and w/o ignoring non-unique reads. 
> A skeptical view of your workflow would note that (a) multi-mapping reads may 
> be legitimate and should not be filtered out and (b) Cufflinks/compare/diff 
> assembly and quantitation may smooth out stray reads enough so that your 
> method isn't necessary.
> 
> Thanks for the interesting post,
> J.
> 

Re: [galaxy-user] how to find out the gene_ID correspond to CUFF ID

2011-02-28 Thread David Matthews

Hi,

You need to supply a gene annotation file to cufflinks to easily get the 
gene-id information. Without it, cufflinks simply tries its best to figure out 
what genes are present. The Ensembl gtf file is quite a comprehensive one - 
there is a link to it on the cufflinks manual page.

Good luck!
David

On 28 Feb 2011, at 21:33, Ying Zhang wrote:

> Dear Everyone:
> 
> I have got one output file after running Cufflinks which contains gene
> expression information. However, I found that each gene_ID has a format like
> CUFF.1151175. Do you have any idea how to find the official gene ID that
> corresponds to this CUFF ID? Thank you very much!
> 
> Best
> 
> Ying Zhang, M.D., Ph.D.
> Postdoctoral Associate
> Department of Genetics,
> Yale University School of Medicine
> 300 Cedar Street,S320
> New Haven, CT 06519
> Tel: (203)737-2616
> Fax: (203)737-2286


Re: [galaxy-user] how to find out the gene_ID correspond to CUFF ID

2011-03-01 Thread David Matthews

Hi,

Yeah, that's a good idea too!! I did not know about that tool, shows what I know 
(!) - thanks for the info!

Cheers
David



On 1 Mar 2011, at 04:51, Jeremy Goecks wrote:

> Ying, you could also try using the tools 'Fetch closest non-overlapping 
> feature' and 'Intersect' to find genes nearby transcripts/genes/TSSes of 
> interest; for both tools, you'll want a reference annotation, either from 
> UCSC or Ensembl.
> 
> Best,
> J.
> 
> 



Re: [galaxy-user] how to find out the gene_ID correspond to CUFF ID

2011-03-01 Thread David Matthews

Hi,

Yes, column 1 (seqname) is the chromosome name, and the names must match 
between the gtf file and the reference genome (i.e. with the Ensembl gtf your 
hg19 reference file must call the chromosomes 1, 2, 3 etc). A simpler solution 
than reformatting the gtf is to use a copy of hg19 that lists the chromosomes 
as 1, 2, 3 etc instead of chr1, chr2 etc. Unfortunately I'm only in 
intermittent contact with the web - I might be able to help you properly next 
week when I am back at work. However, I've just publicly shared a history 
called "Bristol hg19..." containing an hg19 file, a female hg19 (missing 
chromosome Y) and an Ensembl gtf file that all work together (i.e. all use the 
same names for the chromosomes!) - just look under "shared data". You will 
probably need to repeat your tophat alignments using your reads and these 
files together.
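
If repeating the alignments is not practical, the names could in principle be 
rewritten on the annotation side instead. The following is only a rough sketch, 
assuming an Ensembl-style gtf (chromosomes called 1, 2, ..., X, Y, MT) and a 
UCSC-style reference (chr1, ..., chrM). The function names are made up for 
illustration and this is not a Galaxy tool:

```python
def to_ucsc_name(seqname):
    """Map an Ensembl-style chromosome name (1, X, MT) to UCSC style (chr1, chrX, chrM)."""
    if seqname.startswith("chr"):
        return seqname          # already UCSC style, leave untouched
    if seqname == "MT":
        return "chrM"           # the mitochondrion is named differently
    return "chr" + seqname


def rename_gtf_lines(lines):
    """Yield gtf lines with column 1 (seqname) rewritten; comments and blanks pass through."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            yield line
            continue
        fields = line.split("\t")
        fields[0] = to_ucsc_name(fields[0])
        yield "\t".join(fields)
```

Unplaced scaffolds and haplotype sequences are named quite differently in the 
two schemes and are not handled by this sketch, so it is probably safest to 
restrict the annotation to the main chromosomes first.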

Good luck!

David



On 1 Mar 2011, at 20:06, Ying Zhang wrote:

> Dear Vasu:
> 
> thank you for your information!
> 
> I have checked the reference and do not find a specific column that include
> chromosome information, do you mean the first column(seqname)? Do you
> happen to
> have one with correct format and I can used for reference annotation? Thanks a
> lot! I only have limited experience in computing so I do not know how
> to format
> this file.
> 
> Best
> 
> Ying
> 
> Quoting vasu punj :
> 
>> I believe you need to reformat the Ensembl file; the chromosome column is
>> not correct.
>>  
>> Vasu
>> 
>> --- On Tue, 3/1/11, Ying Zhang  wrote:
>> 
>> 
>> From: Ying Zhang 
>> Subject: Re: [galaxy-user] how to find out the gene_ID correspond to CUFF ID
>> To: "David Matthews" 
>> Cc: galaxy-u...@bx.psu.edu
>> Date: Tuesday, March 1, 2011, 10:59 AM
>> 
>> 
>> Dear David:
>> 
>> I followed your advice and downloaded the reference sequence from Ensembl,
>> then I uploaded this file into galaxy and ran cufflinks using the file as a
>> reference annotation. However, I got an error when running it; the following
>> is the error message I was given:
>> 
>> An error occurred running this job: cufflinks v0.9.3
>> cufflinks -I 30 -F 0.05 -j 0.05 -p 8 -Q 0 -G
>> /galaxy/main_database/files/002/122/dataset_2122219.dat -r
>> /galaxy/data/hg19/sam_index/hg19.fa
>> Error running cufflinks. [11:47:14] Loading reference and sequence.
>> GFF warning: mergi
>> 
>> Do you have any idea of what is going wrong here?
>> 
>> Best
>> 
>> Ying
>> 
>> 

Re: [galaxy-user] downstream analysis of cuffdiff out put

2011-03-10 Thread David Matthews

Hi All,

I agree with this problem and solution. I have a lot of cufflinks, cuffcompare 
and cuffdiff output but I am struggling to relate what this means in terms of 
the real world! I have seen partek software attempt to visualise some of the 
data it generates which appears to be using the FMI data in the cufflinks suite 
but beyond that I struggle. I did have an email conversation with Cole Trapnell 
which eventually centred on the idea that you just have to trust the analysis 
and then go away and do the RT-PCR to check it all out!
So for tools I think:

1. A tool that shows you the layout of known isoforms for a gene and the FMI 
data for each isoform. 

Er, that's it for now from me!

But I also struggle to understand what all the other outputs really mean! What 
does the CDS.diff output tell us? What does the promoters.diff output tell us? 
I know what the cufflinks manual says but I struggle to convert this in my head 
to what is happening to an actual gene, so if anyone has a PowerPoint example 
on a specific gene of what the data is saying in terms of how this relates to 
changes in protein production - that would be great! I'm hoping someone out 
there has had to lecture on this to students, has done a PowerPoint 
presentation and is willing to show it to the galaxy community.

Another point about the analysis of cufflinks data is the subject of the Pseudo 
Autosomal Regions in X and Y - this will make a mess of gene expression 
analysis in some cases, especially because tophat will assign a read to both 
sites and make it a multihit read (which you might then filter out) or it may 
double the true levels of reported expression. Has anyone had thoughts on this?

Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk

On 10 Mar 2011, at 15:55, Jeremy Goecks wrote:

> Jagat,
> 
> Please send queries such as these to the galaxy-user mailing list (cc'd); 
> there are many users on the list who can contribute to this discussion, and 
> there are many additional users that will benefit from this discussion.
> 
>> I was wondering if you can point me to documentation or a URL to guide how 
>> to perform the downstream analysis once we have cuffdiff output.
> 
> In general, I agree that tools are needed to further process 
> cufflinks/compare/diff outputs, but I'm not aware of any that are publicly 
> available. Let's open this issue up for discussion and see if we can reach a 
> consensus about which tools might be useful. Everyone, please feel free to 
> contribute ideas/tools; note that the Galaxy Tool Shed is a nice place for 
> sharing tools you've built for Galaxy:
> 
> http://community.g2.bx.psu.edu/
> 
>> Just like any mRNA-seq experiment to achieve following objectives:
>> 
>> 1.   Reconstruct  all transcripts of a particular gene and corresponding 
>> Cuffdiff  significantly expressed transcripts as called by cuffdiff.
>> 2.   What are different isoforms
>> 3.   Location of splicing
>> 
>> From various output files which unique ID can be matched  from one file say 
>> Cuffdiff.expr (transcript/ isoform/Splicing)  to  other file - 
>> transcript.gtf  corresponding to each sample or combined GTF file.
>> 
> I've got a script that does this for the cuffdiff isoform expression testing 
> file and a GTF file; I'll wrap it up and add it to Galaxy in the next couple 
> weeks. It would probably be useful to have similar scripts for the other 
> expression testing files as well. Also, it would be nice to be able to take 
> the FPKM values generated by Cuffdiff and attach them to their respective 
> transcripts as attributes.
> 
> Best,
> J. 
> 
> ___
> The Galaxy User list should be used for the discussion of
> Galaxy analysis and other features on the public server
> at usegalaxy.org.  Please keep all replies on the list by
> using "reply all" in your mail client.  For discussion of
> local Galaxy instances and the Galaxy source code, please
> use the Galaxy Development list:
> 
>  http://lists.bx.psu.edu/listinfo/galaxy-dev
> 
> To manage your subscriptions to this and other Galaxy lists,
> please use the interface at:
> 
>  http://lists.bx.psu.edu/


[galaxy-user] Pseudo Autosomal regions in Chrs X and Y

2011-03-10 Thread David Matthews

Hi All again,

A separate point about the analysis of cufflinks data is the subject of the 
Pseudo Autosomal Regions in X and Y - this will make a mess of gene expression 
analysis in some cases especially because tophat will assign a read to both 
places which therefore makes it a multihit read (which you might then filter 
out) or it may double the true levels of reported expression. Anyone had 
experience/thoughts on this?

Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk







[galaxy-user] Tophat version

2011-03-14 Thread David Matthews

Hi,

Just wondering when the tophat portion of Galaxy will be updated? It's currently 
version 1.1.1 and there is now a version 1.2.0 (in fact I think there have been 
4 updates).

Cheers
David



[galaxy-user] A genbank to gtf converter

2011-03-14 Thread David Matthews

Hi again,

Does anyone know of a genbank to gtf converter? I have heard such things exist 
but never found one...

Cheers
David



Re: [galaxy-user] A genbank to gtf converter

2011-03-28 Thread David Matthews

Hi Jen,

Many thanks for the reply. Sadly my programming is not up to anything like a 
gbk to gtf converter! The main reason I want one is that as a virologist this 
would be very useful since many viruses do not have a gtf file but do have 
genbank submissions. I know of a site that has some viruses listed together 
with GFF files but alas I cannot find a GFF to GTF converter - nightmare!!

I'll keep looking for one and if I find it I'll let you know.
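
In the meantime, for data that does sit in a UCSC track, the gene_id fix-up in 
the Table Browser route quoted below could be sketched roughly as follows. This 
assumes a mapping from transcript name to gene name has already been built from 
the "name" and "name2" fields of the "all fields" export; the function name 
`set_gene_ids` is hypothetical and this is not an existing Galaxy tool:

```python
import re

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')
GENE_ID = re.compile(r'gene_id "[^"]*"')


def set_gene_ids(gtf_lines, gene_by_transcript):
    """Yield gtf lines with gene_id replaced by the gene name looked up via transcript_id.

    Lines whose transcript is missing from the mapping are passed through unchanged.
    """
    for line in gtf_lines:
        match = TRANSCRIPT_ID.search(line)
        gene = gene_by_transcript.get(match.group(1)) if match else None
        if gene is not None:
            # A function is used as the replacement so the gene name is inserted literally.
            line = GENE_ID.sub(lambda _m: 'gene_id "%s"' % gene, line, count=1)
        yield line
```

Inside Galaxy the same join could presumably be done with the text 
manipulation tools, which is what Jen suggests.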

Cheers
David


On 23 Mar 2011, at 18:02, Jennifer Jackson wrote:

> Hello David,
> 
> This is a great idea that the team has been considering adding, but nothing 
> immediate is planned. There are some external teams that are working on 
> outside development, and this is on their list, too.
> 
> If interested in what that project is doing, please see this thread:
> http://lists.bx.psu.edu/pipermail/galaxy-dev/2011-March/004692.html
> 
> For now, if the data resides in a track at UCSC (many are, especially for 
> vertebrate genomes and it is updated daily), using the Table browser can 
> allow you to export the data in GTF and push to Galaxy with the "Get Data" 
> tool. Since some of the data can be large, using BX Main (our local UCSC 
> mirror) may be the best source.
> 
> To do this, navigate to the target genome and track (RefSeq under Gene 
> Predictions, others under Mrna & EST), and choose output format "GTF - gene 
> transfer format". Please note that the "gene_id" attribute in the 9th field 
> will not be populated with the gene name (will be same as transcript_id). 
> This is just how UCSC does it right now (on their list to get the full GTF 
> output set up in the TB, as far as we know). But, to get that info now, go 
> back in and reexport the same table data again as "all fields from selected 
> table" into Galaxy and the gene name will be in the data field named "name2". 
> The text manipulation tools can help to format the data.
> 
> A workflow would be a good option once you have the tool path worked out, so 
> that it can be reused without having to do it all again, for future similar 
> genbank datasets. You may even want to publish the workflow for others to 
> use, as it is a very popular request, and maybe add a published page to 
> explain how to use/prep data for input.
> 
> Apologies for the current inconvenience, but hopefully this can get you going 
> until a more direct method is implemented directly in Galaxy main.
> 
> Great idea that many other users are also very interested in. Any 
> contributions (page, workflow) would be most welcomed. A tool that does the 
> extraction directly from Genbank would also be welcomed in the Tool Shed, if 
> you want to contribute.
> http://community.g2.bx.psu.edu/
> 
> Best,
> 
> Jen
> Galaxy team
> 
> 
> On 3/14/11 1:15 PM, David Matthews wrote:
>> Hi again,
>> 
>> Does anyone know of a genbank to gtf converter? I have heard such things 
>> exist but never found one...
>> 
>> Cheers
>> David
>> 
>> 
> 
> -- 
> Jennifer Jackson
> http://usegalaxy.org
> http://galaxyproject.org



Re: [galaxy-user] Pseudo Autosomal regions in Chrs X and Y

2011-03-28 Thread David Matthews

Hi,

Again, thanks for the feedback. I made my own female hg19 by deleting chrY from 
my copy of hg19 so that's OK. It still leaves the problem of how to analyse male 
transcriptomes, since reads that map to PAR1 and PAR2 genes get reported as 
multimap reads, which can end up being filtered out depending on how you analyse 
your transcriptome. If I knew with certainty where PAR1 and PAR2 are on chrY of 
hg19, I was planning to replace the nucleotides with N's on chrY so that they 
would no longer show up as a multimap problem - do you (or anyone else) happen 
to know the co-ordinates on hg19?

Cheers
David

On 23 Mar 2011, at 14:19, Jennifer Jackson wrote:

> Hi David,
> 
> Right now we don't have anything built-in to filter out this type of 
> duplication automatically.
> 
> As a potential option, did you know that we offer a "Canonical Female" build 
> for certain genomes? This may help with some of the duplication issues, if 
> the loss of novel Y is OK for your project.
> 
> Please see:
> https://bitbucket.org/galaxy/galaxy-central/wiki/GenomeData
> 
> Thanks for bringing up a good point!
> 
> Best,
> Jen
> 
> 
> On 3/10/11 8:44 AM, David Matthews wrote:
>> Hi All again,
>> 
>> A separate point about the analysis of cufflinks data is the subject of
>> the Pseudo Autosomal Regions in X and Y - this will make a mess of gene
>> expression analysis in some cases especially because tophat will assign
>> a read to both places which therefore makes it a multihit read (which
>> you might then filter out) or it may double the true levels of reported
>> expression. Anyone had experience/thoughts on this?
>> 
>> Best Wishes,
>> David.
>> 
>> __
>> Dr David A. Matthews
>> 
>> Senior Lecturer in Virology
>> Room E49
>> Department of Cellular and Molecular Medicine,
>> School of Medical Sciences
>> University Walk,
>> University of Bristol
>> Bristol.
>> BS8 1TD
>> U.K.
>> 
>> Tel. +44 117 3312058
>> Fax. +44 117 3312091
>> 
>> d.a.matth...@bristol.ac.uk <mailto:d.a.matth...@bristol.ac.uk>
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
> 
> -- 
> Jennifer Jackson
> http://usegalaxy.org
> http://galaxyproject.org


Re: [galaxy-user] Pseudo Autosomal regions in Chrs X and Y

2011-03-29 Thread David Matthews

Fantastic, many thanks!
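
With the chrY co-ordinates quoted below, the masking idea could be sketched 
along these lines. This is only an illustration: it assumes the co-ordinates 
are 1-based and inclusive, that chrY has already been read into a single 
string, and the names `HG19_CHRY_PARS` and `mask_regions` are made up here:

```python
# PAR co-ordinates on chrY of hg19 as quoted from UCSC in this thread
# (assumed to be 1-based and inclusive).
HG19_CHRY_PARS = [(10001, 2649520), (59034050, 59363566)]


def mask_regions(sequence, regions):
    """Return a copy of sequence with each 1-based inclusive (start, end) region set to N."""
    bases = bytearray(sequence, "ascii")
    for start, end in regions:
        end = min(end, len(bases))          # tolerate regions running past the end
        if start <= end:
            bases[start - 1:end] = b"N" * (end - start + 1)
    return bases.decode("ascii")
```

Calling `mask_regions(chrY_sequence, HG19_CHRY_PARS)` would then leave the 
chrX copies as the only place those reads can map. Reading and writing the 
fasta file is left out, and any aligner index built from the original genome 
would presumably need to be rebuilt afterwards.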


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk






On 29 Mar 2011, at 00:19, Jennifer Jackson wrote:

> Hi David,
> 
> The PAR regions are documented at UCSC on the hg19 genome gateway page (and 
> for some other recent genomes). Start at the main page, click into Genomes, 
> select hg19, then scroll down to credits:
> 
> http://genome.ucsc.edu/
> 
> quote:
> 
> The Y chromosome in this assembly contains two pseudoautosomal regions (PARs) 
> that were taken from the corresponding regions in the X chromosome and are 
> exact duplicates:
> 
> chrY:10001-2649520 and chrY:59034050-59363566
> chrX:60001-2699520 and chrX:154931044-155260560
> 
> Hopefully this helps!
> Jen
> 
> On 3/28/11 2:04 PM, David Matthews wrote:
>> Hi,
>> 
>> Again, thanks for the feedback. I made my own female hg19 by deleting chrY 
>> from my copy of hg19 so thats OK. It still leaves the problem of how to 
>> analyse male transcriptomes since maps to PAR1 and 2 genes get reported as 
>> multimap reads which can end up being filtered out depending on how you 
>> analyse your transcriptome. If I knew with certainty where PAR1 and 2 are on 
>> chrY of hg19 I was planning to replace the nucleotides with N's on chrY so 
>> that they would no longer show up as a multimap problem - do you (or anyone 
>> else) happen to know the co-ordinates on hg19?
>> 
>> Cheers
>> David
>> 
>> 
>> On 23 Mar 2011, at 14:19, Jennifer Jackson wrote:
>> 
>>> Hi David,
>>> 
>>> Right now we don't have anything built-in to filter out this type of 
>>> duplication automatically.
>>> 
>>> As a potential option, did you know that we offer a "Canonical Female" 
>>> build for certain genomes? This may help with some of the duplication 
>>> issues, if the loss of novel Y is OK for your project.
>>> 
>>> Please see:
>>> https://bitbucket.org/galaxy/galaxy-central/wiki/GenomeData
>>> 
>>> Thanks for bringing up a good point!
>>> 
>>> Best,
>>> Jen
>>> 
>>> 
>>> On 3/10/11 8:44 AM, David Matthews wrote:
>>>> Hi All again,
>>>> 
>>>> A separate point about the analysis of cufflinks data is the subject of
>>>> the Pseudo Autosomal Regions in X and Y - this will make a mess of gene
>>>> expression analysis in some cases especially because tophat will assign
>>>> a read to both places which therefore makes it a multihit read (which
>>>> you might then filter out) or it may double the true levels of reported
>>>> expression. Anyone had experience/thoughts on this?
>>>> 
>>>> Best Wishes,
>>>> David.
>>>> 
>>>> __
>>>> Dr David A. Matthews
>>>> 
>>>> Senior Lecturer in Virology
>>>> Room E49
>>>> Department of Cellular and Molecular Medicine,
>>>> School of Medical Sciences
>>>> University Walk,
>>>> University of Bristol
>>>> Bristol.
>>>> BS8 1TD
>>>> U.K.
>>>> 
>>>> Tel. +44 117 3312058
>>>> Fax. +44 117 3312091
>>>> 
>>>> d.a.matth...@bristol.ac.uk<mailto:d.a.matth...@bristol.ac.uk>
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>> 
>>> --
>>> Jennifer Jackson
>>> http://usegalaxy.org
>>> http://galaxyproject.org
>> 
>> 
> 
> -- 
> Jennifer Jackson
> http://usegalaxy.org
> http://galaxyproject.org


Re: [galaxy-user] output file for bowtie suppressed reads (-m/--max)

2011-03-29 Thread David Matthews

Hi,

I have a workflow on the published workflow list (called "Bristol workflow to 
get") which takes sam files and separates the unique from the non unique 
mapped reads. This might help in the sense I have used this kind of approach to 
count up the number of unique vs non uniquely mapped reads. Let me know if you 
need more info.
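
Outside Galaxy, the same kind of count could be sketched as below. This is not 
a description of what the workflow does internally; it assumes single-end 
reads and an aligner that writes one sam record per reported alignment (so a 
multiply mapped read appears more than once), and the function name 
`count_unique_and_multi` is made up for illustration:

```python
from collections import Counter


def count_unique_and_multi(sam_lines):
    """Return (unique, multi): read names with exactly one / more than one mapped record."""
    hits = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue                      # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue                      # not a complete alignment record
        if int(fields[1]) & 0x4:
            continue                      # flag bit 0x4 set: read is unmapped
        hits[fields[0]] += 1
    unique = sum(1 for n in hits.values() if n == 1)
    multi = sum(1 for n in hits.values() if n > 1)
    return unique, multi
```

Reads that bowtie suppresses with -m never reach the sam output at all, so 
they would not be counted by this approach.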

Cheers,
David

On 23 Mar 2011, at 02:30, karlerh...@berkeley.edu wrote:

> 
> Hi all,
> 
> I'm running into a problem with the output from bowtie mapping for
> illumina reads.  I've been testing bowtie with a subset of my illumina
> reads with the intention of estimating the percent of *uniquely* mapping
> reads represented in my library, ie., non-repetitive reads.
> 
> I've run bowtie on this small subset (~2000 reads) with and without the -m
> option specified with n=1 and I get many fewer mappable reads with -m
> specified, but if I ask galaxy to "Write all reads with a number of valid
> alignments exceeding the limit set with the -m option to a file (--max)",
> the file always comes up empty.
> 
> Where did all those suppressed reads go?
> 
> thanks for any help you can offer,
> 
> karl
> 


Re: [galaxy-user] Around Pittsburgh on April 6? Attend the Intro to Galaxy Sessions @ Pitt

2011-03-31 Thread David Matthews

I would be very keen to see this as a webcast or similar - I'd even stay up 
late to watch it!


On 31 Mar 2011, at 20:19, Dave Clements wrote:

> Hello all,
> 
> Dan Blankenberg will be giving two workshops on Galaxy at the University of 
> Pittsburgh on April 6.  The presentations are open to the public.  See below 
> for details and please contact Dan, or Carrie Iwema at Pitt, if you have any 
> questions.
> 
> Thanks,
> 
> Dave C.
> 
> Intro to Galaxy
> http://galaxy.psu.edu/ 
> 
> Dan Blankenberg, PhD
> Center for Comparative Genomics & Bioinformatics
> Penn State University
> 
> Galaxy allows you to do analyses you cannot do anywhere else without the need 
> to install or download anything. 
> You can analyze multiple alignments, compare genomic annotations, profile 
> metagenomic samples & more...
> 
> 
> Wednesday 6th April
> 
> 10 am – 12 pm   Intro to Galaxy (general interest)
> 
> 2 pm - 4 pm     Working w/NGS Data (advanced users)
> 
> 
> 
> University of Pittsburgh
> 
> Falk Library
> 
> Conference Room B
> 
> 
> You are welcome to bring your laptop.
> 
> 
> Carrie L. Iwema, PhD, MLS
> Information Specialist in Molecular Biology
> 
> Health Sciences Library System
> University of Pittsburgh
> 200 Scaife Hall
> 3550 Terrace St
> Pittsburgh, PA  15261
> 
> 412-383-6887
> 412-648-8819 (fax)
> iw...@pitt.edu
> www.hsls.pitt.edu/molbio
> 
> -- 
> http://galaxy.psu.edu/gcc2011/
> http://getgalaxy.org
> http://usegalaxy.org/


[galaxy-user] Assemble a consensus genome from NGS data

2011-04-04 Thread David Matthews

Hi,

Does anyone know how to get a consensus genome from NGS data indicating the 
percent variance at each nucleotide? I have a small virus genome with many-fold 
coverage from my transcriptomic run. I'd like to know what the transcriptome 
indicates is the actual genome, plus get a feel for any hotspots where there 
appears to be significant variance from the reference sequence (i.e. because 
the reference is wrong or perhaps because of frequent errors in that region due 
to RNA pol II having a problem accurately transcribing the sequence).
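
To make the request concrete, here is a rough sketch of the calculation I have 
in mind. It assumes the bases observed at each position have already been 
collected into one string per position (e.g. from a pileup); the function name 
`consensus_with_variance` is made up and this is not an existing tool:

```python
from collections import Counter


def consensus_with_variance(columns):
    """For each column of observed bases, return (consensus base, percent of reads disagreeing)."""
    result = []
    for bases in columns:
        counts = Counter(b.upper() for b in bases if b.upper() in "ACGT")
        depth = sum(counts.values())
        if depth == 0:
            result.append(("N", 0.0))     # no usable coverage: consensus unknown
            continue
        base, top = counts.most_common(1)[0]
        result.append((base, 100.0 * (depth - top) / depth))
    return result
```

Positions with a high percentage would be the candidate hotspots, although 
telling real variation from sequencing error would need proper statistics on 
top of this.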

Many thanks!

David



Re: [galaxy-user] [galaxy-bugs] Cufflinks

2011-04-07 Thread David Matthews

Hi,

I agree with Jeremy on this one, you should use tophat as that is what 
cufflinks is expecting - particularly the fact that the samfile is sorted 
correctly which I'm not certain bowtie does. The other thing that may be wrong 
is that your chromosome annotation may not match if you used a different 
reference genome for the alignment compared to the reference genome you told 
cufflinks to use (I did that once!). As Jeremy says cufflinks should be OK 
without a gene annotation file.
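
A quick way to check for that kind of naming mismatch is to compare the 
sequence names in the sam header with those in the reference fasta. A minimal 
sketch, assuming the sam file has @SQ header lines; the function names are 
made up for illustration:

```python
def sam_reference_names(sam_lines):
    """Collect reference sequence names from the @SQ header lines of a sam file."""
    names = set()
    for line in sam_lines:
        if not line.startswith("@"):
            break                         # header is over, stop reading
        if line.startswith("@SQ"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def fasta_names(fasta_lines):
    """Collect sequence names (first word after '>') from fasta header lines."""
    return {
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and line[1:].split()
    }
```

If `sam_reference_names(...) - fasta_names(...)` is not empty, the alignment 
and the reference given to cufflinks do not use the same chromosome names.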

Cheers
David

On 7 Apr 2011, at 16:11, Jeremy Goecks wrote:

> Hi Gabor,
> 
> I'm moving your email to the galaxy-user mailing list because it concerns 
> galaxy usage; also, there's a substantial community of users doing RNA-seq 
> that may be able to offer suggestions to help you out.
> 
> To your issue:
> 
>> I used Galaxy (Bowtie) to successfully map 15 million Illumina reads trimmed 
>> to 65bp.  When I applied Cufflinks to the BAM data no transcripts were 
>> reported even though it ran OK.  It is possible that very little mapped to 
>> known exons in this data set.  Does Cufflinks only report data for known 
>> transcripts?  I thought it was designed to work without a reference 
>> annotation.  How does it decide what qualifies as a transcript?  Any ideas 
>> why I got such a result?
> 
> Your problems are likely at least partially due to using Bowtie instead of 
> Tophat. Tophat is the standard way to map reads so that Cufflinks can 
> assemble transcripts. Cufflinks assembles transcript--de novo or 
> reference-guided--based on the mapped reads, but mapped reads must include 
> spliced reads, which are generated by Tophat but not Bowtie.
> 
> Good luck,
> J.
> 
> 

Re: [galaxy-user] Assemble a consensus genome from NGS data

2011-04-08 Thread David Matthews

Hi Ben,

Do not apologise, this is excellent guidance! I have been bumbling about with 
pileup and your explanation makes it much clearer. I did not use BWA but 
tophat instead so I'll give it a go with BWA and see if it makes a difference. 
I'm off to a virology conference next week so I'm not sure how much chance I'll 
get to work on it but many thanks again and once I do get my teeth into it I'm 
sure I'll have some more questions - especially on the stats front. On a 
related subject I am also looking at indels to see if the virus has hotspots 
for transcription errors that may reflect a deliberate attempt by the virus to 
modulate RNApolII function through secondary RNA structure interfering with 
polII fidelity (I have no other evidence for this, just a mad shot in the 
dark!). Have you ever looked at this either?

Best Wishes,
David

On 8 Apr 2011, at 05:08, Benjamin Dickins wrote:

> Hi David,
> I'm sorry for a slow response. Relatively recently I solved a problem a bit 
> like this and would be happy to share more information with you. If your 
> genome is small I think it makes sense to map to a reference and identify 
> variant sites. (In my opinion de novo assembly isn't needed - see below).
> 
> A basic approach is: groom FASTA file -> map with BWA -> filter SAM (uniquely 
> mapped reads only) -> SAM-to-BAM -> Generate pileup -> Filter pileup
> 
> This gives you a position-by-position summary relative to the reference. And 
> that last step is important and needs the most care: you can have it print 
> out differences and total numbers of non-reference bases. I can share some 
> information about thresholding how many of these constitute significant 
> evidence that a non-reference base is actually there at that position 
> (basically I use a binomial distribution and ask whether the distribution of 
> ref/non-ref would occur by chance). Given that coverage of small genomes 
> tends to be high, your first question about determining the actual genome 
> sequence (or the quasispecies consensus if you prefer!) can be answered by 
> majority rules: i.e., a small script (or with tools under "Text Manipulation" 
> heading) to read off the base with the most support at each position and then 
> to test whether that base == base in reference nucleotide column.
> 
> It's probably also worth thinking about PCR duplicates (from library prep) as 
> these could be a significant source of error, but they are also tricky when 
> many reads will be identical anyway in the input DNA.
> 
> Feel free to get in touch with me if you need a bit more clarity and/or some 
> more specifics...
> 
> cheers,
> Ben
> 
> On Apr 4, 2011, at 9:55 PM, Anton Nekrutenko wrote:
> 
>>> From: David Matthews 
>>> Date: April 4, 2011 6:02:03 PM EDT
>>> To: galaxy-user@lists.bx.psu.edu
>>> Subject: [galaxy-user] Assemble a consensus genome from NGS data
>>> 
>>> Hi,
>>> 
>>> Does anyone know how to get a consensus genome from NGS data indicating the 
>>> percent variance at each nucleotide? I have a small virus genome with 
>>> manyfold coverage from my transcriptomic run. I'd like to know what the 
>>> transcriptome indicates is the actual genome plus get a feel for any 
>>> hotspots where there appears to be significant variance from the reference 
>>> sequence (i.e. because the reference is wrong or perhaps because of 
>>> frequent errors in that region due to RNA pol II having a problem 
>>> accurately transcribing the sequence).
>>> 
>>> Many thanks!
>>> 
>>> David
>>> 
>>> 
>> 
> 
> Benjamin Dickins
> Postdoctoral Researcher
> Center for Comparative Genomics and Bioinformatics
> The Pennsylvania State University
> 
> 302 Wartik Laboratory
> University Park, PA 16802, USA
> Cell/mobile: +1 814 777 1852
> Office tel: +1 814 863 2185
> Office fax: +1 8
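
The majority-rule consensus and the binomial check that Ben describes above can be sketched in a few lines of Python. This is only an illustration, not Ben's actual script: the function names, the per-base error rate of 0.01 and the significance cut-off of 0.001 are all assumptions, and the per-position base counts are assumed to have been extracted from the pileup already.

```python
from math import exp, lgamma, log

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p) with 0 < p < 1.

    Terms are computed in log space so that deep coverage (large n)
    does not overflow.
    """
    def log_pmf(i):
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))
    return min(1.0, sum(exp(log_pmf(i)) for i in range(k, n + 1)))

def call_position(ref_base, counts, error_rate=0.01, alpha=0.001):
    """Summarise one pileup position.

    counts maps each observed base to the number of reads supporting it.
    error_rate and alpha are hypothetical defaults, not recommended values.
    """
    depth = sum(counts.values())
    if depth == 0:
        return {"consensus": "N", "differs_from_ref": False,
                "non_ref_fraction": 0.0, "significant": False}
    consensus = max(counts, key=counts.get)       # majority rules
    non_ref = depth - counts.get(ref_base, 0)
    # Chance of seeing this many non-reference bases from errors alone.
    p_value = binomial_tail(non_ref, depth, error_rate)
    return {"consensus": consensus,
            "differs_from_ref": consensus != ref_base,
            "non_ref_fraction": non_ref / depth,
            "significant": p_value < alpha}
```

For example, call_position("A", {"A": 10, "G": 90}) reports "G" as the consensus and flags the position as different from the reference, whereas two non-reference reads out of 100 would not pass the threshold under these assumed settings.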

Re: [galaxy-user] Workflows with run time parameters broken?

2011-04-20 Thread David Matthews

Hi,

I've followed this thread with interest. I have had a problem with one of my 
workflows: it fails to perform a sort on a dataset, returning the error "sort: 
invalid number at field start: invalid count at start of `None,Nonen'". 
However, this error makes no sense. When I ask it to run the job again with the 
input it is supposed to sort, it works just fine. When the workflow is getting 
ready to be submitted I've checked and double checked all the links and they 
are correct too. Is my error related in some way? Does this make sense?

Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk

On 20 Apr 2011, at 15:47, Dannon Baker wrote:

> The issue you're seeing is actually related to improper initialization of the 
> workflow parameters index in the special case where a particular parameter is 
> only used in a rename action and not in a tool step.  This is fixed in 
> revision 5400:6cacf178a129.
> 
> -Dannon
> 
> 
> On Apr 20, 2011, at 10:28 AM, Peter Cock wrote:
> 
>> Hi all,
>> 
>> I noticed a new problem with our local Galaxy (recently updated),
>> but found it happens on the public installation too.
>> 
>> I can edit workflows, and create run time parameters like ${Name}
>> or ${Name with spaces}, but when I come to run the workflow I get
>> an error:
>> 
>>> Server Error
>>> An error occurred. See the error logs for more information.
>>> (Turn debug on to display exception reports here)
>> 
>> e.g. This trivial example takes a FASTA file, does a minimum length
>> filter, and tries to rename the output to a run time parameter string:
>> 
>> http://main.g2.bx.psu.edu/u/peterjc/w/test-workflow-filter-fasta-min-length-10
>> 
>> Is this a known issue? I couldn't find a match on the bitbucket
>> tracker.
>> 
>> Thanks,
>> 
>> Peter
> 

Re: [galaxy-user] Workflows with run time parameters broken?

2011-04-20 Thread David Matthews

Done, many thanks...


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk






On 20 Apr 2011, at 17:44, Dannon Baker wrote:

> I'll take a look at the workflow in question, if you'd like to share it with 
> me at this email address, 'dannonba...@me.com'.
> 
> -Dannon
> 
> 
> On Apr 20, 2011, at 12:42 PM, David Matthews wrote:
> 
>> Hi,
>> 
>> I've followed this thread with interest. I have had a problem on one of my 
>> workflows, it fails to perform a sort on a dataset returning the error 
>> "sort: invalid number at field start: invalid count at start of 
>> `None,Nonen'". However, this error makes no sense. When I ask it to run the 
>> job again with the input it is supposed to sort it works just fine. When the 
>> workflow is getting ready to be submitted I've checked and double checked 
>> all the links and they are correct too. Is my error related in some way? 
>> Does this make sense?
>> 
>> 
>> Best Wishes,
>> David.
>> 
>> __
>> Dr David A. Matthews
>> 
>> Senior Lecturer in Virology
>> Room E49
>> Department of Cellular and Molecular Medicine,
>> School of Medical Sciences
>> University Walk,
>> University of Bristol
>> Bristol.
>> BS8 1TD
>> U.K.
>> 
>> Tel. +44 117 3312058
>> Fax. +44 117 3312091
>> 
>> d.a.matth...@bristol.ac.uk
>> 
>> 
>> 
>> 
>> 
>> 
>> On 20 Apr 2011, at 15:47, Dannon Baker wrote:
>> 
>>> The issue you're seeing is actually related to improper initialization of 
>>> the workflow parameters index in the special case where a particular 
>>> parameter is only used in a rename action and not in a tool step.  This is 
>>> fixed in revision 5400:6cacf178a129.
>>> 
>>> -Dannon
>>> 
>>> 
>>> On Apr 20, 2011, at 10:28 AM, Peter Cock wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> I noticed a new problem with our local Galaxy (recently updated),
>>>> but found it happens on the public installation too.
>>>> 
>>>> I can edit workflows, and create run time parameters like ${Name}
>>>> or ${Name with spaces}, but when I come to run the workflow I get
>>>> an error:
>>>> 
>>>>> Server Error
>>>>> An error occurred. See the error logs for more information.
>>>>> (Turn debug on to display exception reports here)
>>>> 
>>>> e.g. This trivial example takes a FASTA file, does a minimum length
>>>> filter, and tries to rename the output to a run time parameter string:
>>>> 
>>>> http://main.g2.bx.psu.edu/u/peterjc/w/test-workflow-filter-fasta-min-length-10
>>>> 
>>>> Is this a known issue? I couldn't find a match on the bitbucket
>>>> tracker.
>>>> 
>>>> Thanks,
>>>> 
>>>> Peter
>>> 
>> 
> 

[galaxy-user] SNPs, Indels and so on in virus genomes

2011-05-03 Thread David Matthews

Hi,

I recently sent out an email asking if anyone knew much about analysis of SNPs 
etc. and how to visualise them. I got some very useful answers and planned to 
return to the problem when I got a better chance to work on this in some depth. 
I now have the time but, like an idiot, I've accidentally deleted those email 
replies! So can I please ask again: does anyone have experience of SNP analysis 
and, especially, visualisation, who can hold my hand whilst I work this out 
(apologies to those who replied last time, but can you get in touch again)?


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk




Re: [galaxy-user] RNA seq analysis

2011-05-06 Thread David Matthews

Hi,

I have done exactly the same kind of thing for adenovirus so I can help with 
it. In answer to question 1, you do not need to index the genome; it will be 
done for you when tophat is called. Secondly, you should leave the multihit 
setting at 40 and filter out the multihits after the analysis - this will allow 
you to determine whether you have a multihit problem and, if so, how big it is 
and where it is on the genome. I have a workflow on Galaxy which you can use called 
"Bristol workflow to get sorted unique proper pair mapped reads". If you plug 
in your sam file it should give you files listing only unique hits and those 
which map more than once. This workflow assumes you have paired end data but it 
can be modified to work with single end reads as well.
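
For anyone who wants to see the idea outside Galaxy, here is a minimal Python sketch of separating unique hits from multihits in a SAM file. It is not the Bristol workflow itself. It assumes the aligner writes an NH:i: tag (number of reported alignments for the read) on each alignment line, and it treats a missing NH tag as a unique hit, which is an assumption you may want to change. The function name is made up for the example.

```python
def split_by_hit_count(sam_lines):
    """Separate SAM alignment lines into unique hits and multihits.

    Returns (header_lines, unique_lines, multihit_lines).
    Unmapped reads (FLAG bit 0x4) are dropped.
    """
    header, unique, multi = [], [], []
    for line in sam_lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("@"):          # header line
            header.append(line)
            continue
        fields = line.split("\t")
        if int(fields[1]) & 0x4:          # read is unmapped
            continue
        hits = 1                          # assumption: no NH tag means unique
        for tag in fields[11:]:           # optional fields start at column 12
            if tag.startswith("NH:i:"):
                hits = int(tag[5:])
                break
        (unique if hits == 1 else multi).append(line)
    return header, unique, multi
```

Counting the lines in each list gives a quick feel for how big the multihit problem is, and the position columns of the multihit lines show where on the genome it sits.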

Hope this helps.

Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk

On 6 May 2011, at 17:09, puvan...@umn.edu wrote:

> Hi
> 
> I have a couple of questions regarding RNA seq analysis. My questions are
> 1. I need to use a viral genome (very small, ~2kb) as a reference genome and 
> it is not available in Galaxy. I guess I can use this data from my history. I 
> have a fasta file but I am not sure whether I have to do some kind of 
> indexing or not.
> 
> 2. In Tophat, the default for "maximum number of alignments to be allowed" is 
> 40. My understanding is that a single read can be aligned to a maximum of 40 
> different places. I am wondering why this is 40. Is there any specific reason? 
> If I need unique mapping, I have to use 1 instead of 40. Am I correct?
> 
> 
> Thanks
> 
> SP
> 
> 
> 

[galaxy-user] Converting transcriptomes to proteomes

2011-06-09 Thread David Matthews

Dear Galaxy users,

I am trying to modify the human proteome based on my transcriptomics data. In 
short I want to use my transcriptomics data to identify snps and from that 
identify coding changes that result from the snps. Ultimately I'd like to 
create a customised canonical proteome based on my transcriptomic data. Does 
anyone know how this might be done in Galaxy? I have started by running a 
pileup and so on but I am not a human geneticist (I am a virologist) so I may 
be making some fundamental errors!!

Any help is gratefully received!


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk






Re: [galaxy-user] Aligning against Multiple Reference Sequences

2011-06-09 Thread David Matthews

Hi John,

Probably the simplest thing for you to do would be to concatenate the two 
genomes one after the other using the concatenate tool under "text 
manipulation". This will generate what looks like a new organism with two 
chromosomes, one from bacterium A and one from bacterium B. When you run tophat or 
bowtie the sam file will indicate which "chromosome" (i.e. which bacteria) it 
assigned the read to.
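
Outside Galaxy the same trick can be sketched in Python as below. One thing this sketch adds that the plain concatenate tool does not do: it prefixes each sequence name with a label, so that two genomes whose FASTA headers happen to be identical can still be told apart in the SAM file. The labels and function names are invented for the example.

```python
def concatenate_references(named_fastas):
    """Join several FASTA texts into one combined reference.

    named_fastas is a list of (label, fasta_text) pairs. Every header is
    rewritten as ">label_originalname" so the source genome stays visible.
    """
    out = []
    for label, text in named_fastas:
        for line in text.splitlines():
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                out.append(">" + label + "_" + line[1:].lstrip())
            else:
                out.append(line)
    return "\n".join(out) + "\n"

def source_of(rname, labels):
    """Return the label whose prefix matches a SAM reference name, else None."""
    for label in labels:
        if rname.startswith(label + "_"):
            return label
    return None
```

After mapping against the combined file, source_of applied to the third column of each SAM line (the reference name) tells you which of the two genomes the read was assigned to.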

Hope this helps.

Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk

On 9 Jun 2011, at 15:18, John David Osborne wrote:

> Are there any tools in Galaxy to align short reads against multiple reference 
> sequences?
>  
> I have a dozen microbial genomes sequenced for which there are 2 reference 
> genomes already sequenced. We have tried aligning each of these individually 
> against either of the reference genomes - some align better against the first 
> reference genome, some align better against the second reference genome. 
> Ideally though I would like to be able to align against both at the same 
> time. Is this possible?
>  
> I have found a tool called GenomeMapper and hints of 2 other tools in 
> development that do something like this, but nothing for Galaxy yet.
>  
> How do others proceed with this type of problem? Workflows appreciated! :)
>  
>  -John
>  

Re: [galaxy-user] Converting transcriptomes to proteomes

2011-06-09 Thread David Matthews

Hi John (and Pablo),

Thanks for the ideas. I assume this is similar to the "aaChanges" tool in the 
standard galaxy setup which I am working with. If I have no luck with that I'll 
look into the SNPEff tool - is it in the Galaxy toolshed? 
I think that, assuming most of the snps are previously characterised, I can 
probably get such a list by long winded means. Fingers crossed!

Cheers
David

On 9 Jun 2011, at 19:18, John David Osborne wrote:

> Hi David,
> 
> I’ve successfully used SNPEff (which can fit into galaxy) to make SNP effect 
> predictions that would affect the proteome, but from genomic not 
> transcriptomic data; I think it might still work on that...
> 
> It takes in pileup/vcf format and predicts coding changes, upstream changes, 
> splice acceptor/donor effects, etc...
> 
> However I don’t think it will re-create an entire proteome for you, i.e. it 
> won’t output the new set of proteins in FASTA format or anything like that. 
> I don’t know of any tool that does that, but it would be nice!
> 
> I cc’d the author of SNPEff in case I am misrepresenting.
> 
>  -John
> 
> 
> 
> On 6/9/11 6:23 AM, "David Matthews"  wrote:
> 
> Dear Galaxy users,
> 
> I am trying to modify the human proteome based on my transcriptomics data. 
> In short I want to use my transcriptomics data to identify snps and from that 
> identify coding changes that result from the snps. Ultimately I'd like to 
> create a customised canonical proteome based on my transcriptomic data. Does 
> anyone know how this might be done in Galaxy? I have started by running a 
> pileup and so on but I am not a human geneticist (I am a virologist) so I may 
> be making some fundamental errors!!
> 
> Any help is gratefully received!
> 
> 
> Best Wishes,
> David.
> 
> __
> Dr David A. Matthews
> 
> Senior Lecturer in Virology
> Room E49
> Department of Cellular and Molecular Medicine,
> School of Medical Sciences
> University Walk,
> University of Bristol
> Bristol.
> BS8 1TD
> U.K.
> 
> Tel. +44 117 3312058
> Fax. +44 117 3312091
> 
> d.a.matth...@bristol.ac.uk
> 
> 
> 
> 
> 
> 
> 

[galaxy-user] hg19 and hg19patch2

2011-06-10 Thread David Matthews

Dear Galaxy-users,

Does anyone know what the differences are between hg19 and hg19patch2 and can 
anyone tell me if the latest ensembl gtf file (v62) is definitely compatible 
with both hg19 and hg19patch2?


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk






On 10 Jun 2011, at 11:39, Michal Stuglik wrote:

> 
> 
> Hi Jen,
> 
> It works, thanks! 
> 
> I am wondering why using Text Manipulation/Compute function, galaxy changes 
> brackets '[' to '__ob__' and '__cb__' for ']', so for this: str(c1)[1:2] --> 
> str(c1)__ob__1:2__cb__  
> 
> thanks a lot,
> michal
> 
>> Hi Michal, 
>> 
>> The tool "Fetch Sequences -> Extract Genomic DNA" can be used to extract 
>> fasta sequences. The coordinates can be BED, GTF, etc. and the "genome" 
>> doesn't necessarily have to be an actual genome, just a fasta file in your 
>> history. 
>> 
>> To subset a data string, the tool "Text Manipulation -> Trim" might be 
>> helpful. This would only work if you want to use the same rules for an 
>> entire file (or split your file up and run the tool on those subfiles using 
>> different rules). Practical for some cases, but not all. 
>> 
>> And the final option is for coordinate data - tools in "Operate on Genomic 
>> Intervals". Once you have the final coordinate set, going back and using the 
>> "Fetch Sequences" tool can capture the associated result fasta sequence, 
>> from a native genome or a fasta file in your history, as described above. 
>> 
>> Hopefully this gives you an option that will work for your project, 
>> 
>> Best, 
>> 
>> Jen 
>> Galaxy team 
>> 
>> On 6/5/11 7:14 AM, Michal Stuglik wrote: 
>>> 
>>> Hi all, 
>>> 
>>> I am wondering if galaxy has tool to substring/extract sequence/text 
>>> from another sequence/text based on coordinates in columns (start, end 
>>> column) or how to do it in Text Manipulation/Compute? 
>>> 
>>> all the best, 
>>> michal 
>>> 
> 

Re: [galaxy-user] Cufflinks advice

2011-06-15 Thread David Matthews

Hi,

As a first guess I would say that your chromosome names do not match somewhere 
along the line. If you look at your sam file and the fasta of the genome you 
are working with (and the gtf file as well if you are using it) you may find, 
for example, one refers to chromosome 1 as "chr1" whilst the other refers to 
chromosome 1 as "1" or even "Chr1" or some other way of referring to the 
chromosome - any of these mismatches can cause you to get an empty output. If 
you are using a built in index it may be you need to change your chromosome 
names from "1" to "chr1" for example. Amazingly, the names of human chromosomes 
are apparently not yet standardised across all databases for the human genome 
(and I presume this may be the case for other genomes as well). 
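
A quick way to check for this kind of mismatch is to list the names used in each file and compare them. The Python sketch below does that on the text of the three files. The function names are invented for the example, and it only looks at names, not at sequence lengths or content.

```python
def fasta_names(fasta_text):
    """Sequence names from FASTA headers (first word after '>')."""
    return {line[1:].split()[0] for line in fasta_text.splitlines()
            if line.startswith(">") and line[1:].split()}

def sam_names(sam_text):
    """Reference names from @SQ header lines and from the RNAME column."""
    names = set()
    for line in sam_text.splitlines():
        fields = line.split("\t")
        if line.startswith("@SQ"):
            names.update(f[3:] for f in fields if f.startswith("SN:"))
        elif line and not line.startswith("@"):
            if len(fields) > 2 and fields[2] != "*":
                names.add(fields[2])
    return names

def gtf_names(gtf_text):
    """Names in the first column of a GTF, skipping comments and blanks."""
    return {line.split("\t")[0] for line in gtf_text.splitlines()
            if line.strip() and not line.startswith("#")}

def compare_names(fasta_text, sam_text, gtf_text=None):
    """Report names that appear in the SAM or GTF but not in the genome."""
    genome = fasta_names(fasta_text)
    report = {"in_sam_not_genome": sam_names(sam_text) - genome}
    if gtf_text is not None:
        report["in_gtf_not_genome"] = gtf_names(gtf_text) - genome
    return report
```

If the SAM file says "1" where the genome says "chr1", the name "1" turns up in the in_sam_not_genome set, which is the situation likely to give an empty cufflinks output.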

Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk

On 15 Jun 2011, at 02:22, Michael Gooch wrote:

> I attempted to run cufflinks on some RNA sequencing data. It seemed to 
> complete without any errors, but the output files were empty. I am trying to 
> figure out if I did something wrong or whether my data needs some additional 
> processing before cufflinks will be able to use it. (Or whether the data is 
> unsuitable for  cufflinks.) The data is paired end reads.
> 
> M. Gooch

[galaxy-user] Looking for new transcripts with cufflinks

2011-07-04 Thread David Matthews

Hi,

I am working with HeLa cells and want to know how to get cufflinks etc. to 
highlight if a region of the genome is being transcribed that is not in the 
ensembl gtf. I know that cufflinks highlights with class code "j" regions that 
do not match a known gene and therefore may be novel but most of these arise 
from transcription on or near known genes. Does anyone know how to look for 
transcription that is clearly distinct from known genes? This is a wild goose 
chase but worth a peek just in case...


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk



Re: [galaxy-user] Looking for new transcripts with cufflinks

2011-07-04 Thread David Matthews

Of course - Doh! Many thanks!!
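
For the record, pulling out just the "u" transcripts could be sketched in Python as below. This assumes a GTF in which each transcript line carries a class_code "..." attribute in its ninth column, as I understand the combined GTF from cuffcompare does; the function name is made up for the example.

```python
import re

# Matches the class_code attribute in a GTF attribute column.
CLASS_CODE = re.compile(r'class_code "([^"]+)"')

def transcripts_with_class_code(gtf_lines, wanted="u"):
    """Keep GTF lines whose class_code attribute equals `wanted`."""
    kept = []
    for line in gtf_lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 9:               # not a full GTF record
            continue
        match = CLASS_CODE.search(fields[8])
        if match and match.group(1) == wanted:
            kept.append(line)
    return kept
```

Passing wanted="j" instead would give the potentially novel isoforms of known genes mentioned in the original question.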


On 4 Jul 2011, at 18:47, Oliver, Gavin wrote:

> "u" represents unknown intergenic transcripts.
> 
> 
> -Original Message-
> From: galaxy-user-boun...@lists.bx.psu.edu on behalf of David Matthews
> Sent: Mon 04/07/2011 17:48
> To: galaxy-user@lists.bx.psu.edu
> Subject: [galaxy-user] Looking for new transcripts with cufflinks
> 
> Hi,
> 
> I am working with HeLa cells and want to know how to get cufflinks etc. to 
> highlight if a region of the genome is being transcribed that is not in the 
> ensembl gtf. I know that cufflinks highlights with class code "j" regions 
> that do not match a known gene and therefore may be novel but most of these 
> arise from transcription on or near known genes. Does anyone know how to look 
> for transcription that is clearly distinct from known genes? This is a wild 
> goose chase but worth a peek just in case...
> 
> 
> Best Wishes,
> David.
> 
> __
> Dr David A. Matthews
> 
> Senior Lecturer in Virology
> Room E49
> Department of Cellular and Molecular Medicine,
> School of Medical Sciences
> University Walk,
> University of Bristol
> Bristol.
> BS8 1TD
> U.K.
> 
> Tel. +44 117 3312058
> Fax. +44 117 3312091
> 
> d.a.matth...@bristol.ac.uk
> 
> 
> 
> 
> 
> 
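
The class code mentioned in the reply above can be used to pull out candidate intergenic transcripts. What follows is only a minimal sketch: it assumes a Cuffcompare-style GTF whose ninth column carries an attribute of the form class_code "u";, and the function name and example records are made up for illustration.

```python
import re

# Matches the class_code attribute in a GTF attribute column (assumed layout).
CLASS_CODE_RE = re.compile(r'class_code "([^"]+)"')


def select_by_class_code(gtf_lines, wanted="u"):
    """Yield GTF lines whose attribute column has class_code == wanted."""
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = CLASS_CODE_RE.search(fields[8])
        if match and match.group(1) == wanted:
            yield line.rstrip("\n")


# Hypothetical records: one "j" (near a known gene) and one "u" (intergenic).
example = [
    'chr1\tCufflinks\texon\t100\t200\t.\t+\t.\t'
    'gene_id "XLOC_1"; transcript_id "TCONS_1"; class_code "j";',
    'chr2\tCufflinks\texon\t500\t900\t.\t-\t.\t'
    'gene_id "XLOC_2"; transcript_id "TCONS_2"; class_code "u";',
]
novel = list(select_by_class_code(example))
for record in novel:
    print(record)
```

Whether a "u" transcript is genuinely novel still depends on how complete the reference annotation is, so the filtered list is a starting point for inspection rather than a result.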



[galaxy-user] calculating percent coverage over the target genome

2011-07-22 Thread David Matthews

Hi

Does anyone know how to calculate how much of a genome was covered by an 
alignment irrespective of the depth at each base?

Cheers
David



[galaxy-user] generating a new fasta from a pileup

2011-07-22 Thread David Matthews

Hi

On a separate issue, I have been having trouble generating a corrected fasta 
file based on a pileup. I have a dataset that is a resequenced genome and I 
want to correct the fasta file based on the consensus and then re run the 
alignments to see how it affects things. However, I cannot for the life of me 
figure out how to do it in Galaxy. Any help appreciated!

David
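
Outside Galaxy, the correction step described here could look roughly like the sketch below. It is an illustration under stated assumptions, not an existing Galaxy tool: it assumes the consensus calls have already been reduced to (sequence name, 1-based position, base) tuples, it handles single-base substitutions only (no indels), and every name in it is hypothetical.

```python
def apply_substitutions(reference, calls):
    """Return a copy of `reference` with single-base substitutions applied.

    reference: dict mapping sequence name -> sequence string
    calls: iterable of (name, 1-based position, base) tuples (assumed input)
    """
    corrected = {name: list(seq) for name, seq in reference.items()}
    for name, pos, base in calls:
        seq = corrected[name]
        if not 1 <= pos <= len(seq):
            raise ValueError(f"position {pos} is outside {name}")
        base = base.upper()
        # Ambiguous or missing calls (N, IUPAC codes) leave the reference as is.
        if base in {"A", "C", "G", "T"}:
            seq[pos - 1] = base
    return {name: "".join(seq) for name, seq in corrected.items()}


# Toy reference and calls, made up for the example.
reference = {"chrTest": "ACGTACGT"}
calls = [("chrTest", 2, "T"), ("chrTest", 5, "N"), ("chrTest", 8, "a")]
corrected = apply_substitutions(reference, calls)
print(corrected["chrTest"])
```

Because only substitutions are applied, coordinates in the corrected sequence stay identical to the original, which keeps a re-run of the alignments directly comparable.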




Re: [galaxy-user] calculating percent coverage over the target genome

2011-07-28 Thread David Matthews

Hi Jen,

Many thanks for this. On a related subject, do you know of a way to correct a 
FASTA file on the basis of a pileup (or even just on the BAM file)?


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk






On 25 Jul 2011, at 17:52, Jennifer Jackson wrote:

> Hello David,
> 
> To calculate coverage, please see the tool "Regional Variation -> Feature 
> coverage". Query and target must both be in Interval/BED format. Query data 
> in Interval/BED format is possible in most of the dataflow paths through the 
> tools and from external sources. The reference genome file will likely need 
> to be imported and formatted.
> 
> This is a simple example history where I pulled the chromInfo file from UCSC 
> and formatted, extracted a subset of genes in BED format, and ran the 
> "Feature Coverage" tool (both directions, see datasets 8 and 9).
> 
> http://main.g2.bx.psu.edu/u/jen-bx-galaxy-edu/h/galaxy-user-calculating-percent-coverage-over-the-target-genome-7-22
> 
> Hopefully this helps,
> 
> Jen
> Galaxy team
> 
> On 7/22/11 12:32 PM, David Matthews wrote:
>> Hi
>> 
>> Does anyone know how to calculate how much of a genome was covered by an 
>> alignment irrespective of the depth at each base?
>> 
>> Cheers
>> David
>> 
>> 
> 
> -- 
> Jennifer Jackson
> http://usegalaxy.org/
> http://galaxyproject.org/
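
The quantity asked about (fraction of the genome covered, irrespective of depth) can also be sketched directly. The snippet below is a minimal illustration, not what the "Feature coverage" tool does internally: it assumes BED-style intervals (0-based, half-open) for the aligned regions and a table of chromosome lengths, and the names and numbers are invented.

```python
def covered_fraction(intervals, chrom_sizes):
    """Fraction of the genome covered by at least one interval.

    intervals: iterable of (chrom, start, end), 0-based half-open (BED-style)
    chrom_sizes: dict mapping chrom -> length in bases
    """
    by_chrom = {}
    for chrom, start, end in intervals:
        by_chrom.setdefault(chrom, []).append((start, end))

    covered = 0
    for spans in by_chrom.values():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:
                # Overlapping or adjacent: extend the current merged block.
                cur_end = max(cur_end, end)
            else:
                covered += cur_end - cur_start
                cur_start, cur_end = start, end
        covered += cur_end - cur_start
    return covered / sum(chrom_sizes.values())


# Toy data: overlapping intervals are counted once, whatever the depth.
intervals = [("chr1", 0, 50), ("chr1", 40, 100), ("chr1", 150, 200), ("chr2", 0, 25)]
sizes = {"chr1": 400, "chr2": 100}
fraction = covered_fraction(intervals, sizes)
print(f"{fraction:.1%} of the genome is covered")
```

Merging before summing is what makes the figure independent of depth: a base hit by one read and a base hit by a thousand both count once.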


Re: [galaxy-user] generating a new fasta from a pileup

2011-08-01 Thread David Matthews

Hi John,

That would be totally fantastic - many thanks!


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk






On 30 Jul 2011, at 16:35, John Nash wrote:

> I have some code which can do most of the requested things. Let me figure out 
> how to galaxy around it, and I'll submit it.
> 
> John
> 
> Sent from my mobile device
> 
> On 2011-07-30, at 12:47 AM, Jennifer Jackson  wrote:
> 
>> Hello David,
>> 
>> Generating a consensus fasta sequence from a BAM or Pile-up file is not yet 
>> possible in Galaxy. To date, the Tool Shed also does not have a 
>> wrapped/novel tool for this function either.
>> 
>> If you or another user were to create such a wrapped tool, it would be most 
>> welcome. As would a tool that would replace the corresponding region of the 
>> reference genome with the variant fasta sequence to create a novel reference 
>> for alignments.
>> 
>> Both great ideas that have been discussed a few times on the list and here 
>> among our team. If you wanted to open a bitbucket ticket, that would be one 
>> way to share exactly what you had in mind and give you a ticket to watch for 
>> if/when tools like this are added. Or, I can open one (or possibly two, one 
>> for each function) for you, just let me know.
>> 
>> https://bitbucket.org/galaxy/galaxy-central/issues?status=new&status=open
>> 
>> Thanks for the great feedback, sorry there wasn't a solution (yet!),
>> 
>> Best,
>> 
>> Jen
>> Galaxy team
>> 
>> 
>> On 7/22/11 12:56 PM, David Matthews wrote:
>>> Hi
>>> 
>>> On a separate issue, I have been having trouble generating a corrected 
>>> fasta file based on a pileup. I have a dataset that is a resequenced genome 
>>> and I want to correct the fasta file based on the consensus and then re run 
>>> the alignments to see how it affects things. However, I cannot for the life 
>>> of me figure out how to do it in Galaxy. Any help appreciated!
>>> 
>>> David
>>> 
>>> 
>>> 
>> 
>> -- 
>> Jennifer Jackson
>> http://usegalaxy.org
>> http://galaxyproject.org/Support

Re: [galaxy-user] miRNA NGS data processing

2011-08-09 Thread David Matthews

Hi,

Tophat may still be an option for you. You can filter out spliced reads by 
filtering column 6 (the CIGAR column) for reads that only map directly (i.e. 
c6=='56M' if you have a 56bp paired end read). But I agree with Jen that most 
likely it is a sort problem.
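
The column 6 filter can be written out as a short script for anyone working on the SAM file directly. This is a sketch under assumptions: it treats a read as unspliced when its CIGAR string is a single match block such as 56M, it keeps header lines untouched, and the function names and example records are hypothetical.

```python
import re

# A CIGAR made of one match block only, e.g. "56M" (no N, I, D or S operations).
FULL_MATCH_RE = re.compile(r"^(\d+)M$")


def is_unspliced(sam_line, read_length=None):
    """True if column 6 (CIGAR) is a single match block of the given length."""
    fields = sam_line.rstrip("\n").split("\t")
    if len(fields) < 6:
        return False
    match = FULL_MATCH_RE.match(fields[5])
    if match is None:
        return False
    return read_length is None or int(match.group(1)) == read_length


def filter_unspliced(sam_lines, read_length=None):
    """Yield header lines plus alignments whose CIGAR is a plain match."""
    for line in sam_lines:
        if line.startswith("@") or is_unspliced(line, read_length):
            yield line


# Made-up records: r1 maps directly, r2 spans a 500 bp intron.
sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    "r1\t0\tchr1\t100\t255\t56M\t*\t0\t0\t" + "A" * 56 + "\t" + "I" * 56,
    "r2\t0\tchr1\t300\t255\t20M500N36M\t*\t0\t0\t" + "A" * 56 + "\t" + "I" * 56,
]
kept = list(filter_unspliced(sam, read_length=56))
for line in kept:
    print(line.split("\t")[0])
```

Passing read_length=None keeps any single-block match, which may be more convenient when the reads have been trimmed to different lengths.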


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk






On 9 Aug 2011, at 07:27, yao chen wrote:

> Hi Mete,
> 
> I am not sure it is the "sort" problem. I find "cufflinks" in galaxy is 
> unstable. I have bam files from Tophat on which I could run cufflinks a few 
> days ago. 
> 
> But these days when I run cufflinks with these bam files, an error is shown. 
> Strangely, it works some of the time. I don't know the reason.
> 
> ChenYao
> 
> 2011/8/9 Jennifer Jackson 
> Hi Mete,
> 
> This FAQ has a workflow for sorting a Bowtie (or any) SAM file for Cufflinks:
> http://main.g2.bx.psu.edu/u/jeremy/p/transcriptome-analysis-faq#faq2
> 
> Thanks!
> 
> Jen
> Galaxy team
> 
> 
> On 8/4/11 10:27 AM, Mete Civelek wrote:
> Hi,
> 
> I'm trying to get read counts or FPKM values for my miRNA NGS data on
> Galaxy. I have aligned the reads using Bowtie, but it appears that
> Cufflinks gives an error when run on the Bowtie alignments (This might
> have something to do with Bowtie's BAM file not being sorted). I know
> that Tophat alignments work well with Cufflinks, but I'm not sure if it
> would be possible to use Tophat for my data since miRNA don't have
> splice junctions. I've tried without success to parameterize Tophat to
> completely avoid assigning splice junctions (by setting the max intron
> length to 1). Is there a way I can get the Bowtie alignment to work with
> Cufflinks on Galaxy? Or perhaps there's a way I can parametrize Tophat
> as to get no splice junctions?
> 
> Thanks,
> 
> Mete
> 
> 
> 
> 
> -- 
> Jennifer Jackson
> http://usegalaxy.org
> http://galaxyproject.org/Support

[galaxy-user] Downloading large files from galaxy

2011-09-13 Thread David Matthews

Hi,

I seem to be having problems downloading large files from galaxy - the request 
times out at about 1GB and I'm downloading 2-3GB. Am I doing something wrong?

David



Re: [galaxy-user] SNPeff tool?

2011-11-08 Thread David Matthews

Hi,

I've had a few email chats with the author of snpEff and the fly in the 
ointment from my perspective is getting the VCF files it needs through Galaxy. 
As I understand it there is currently no way of getting the BAM/SAM files into 
the right input format so that snpEff can use them within a Galaxy setup. So, 
whatever you do you'll still need one or two command line steps. We have a copy 
of snpEff here at Bristol on our Galaxy and when we installed it we then 
realised there was no Galaxy method (that we could think of) for getting the 
input file ready for snpEff to use. This is a pity as it's actually a very nice 
piece of software with a nice professional looking output.

Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk

On 8 Nov 2011, at 13:55, Dannon Baker wrote:

> Hi Laura,
> 
> While the SNPeff developers have made Galaxy wrappers available, this is not 
> a tool we currently have installed for use on the Galaxy server at 
> main.g2.bx.psu.edu.  Off the top of my head, I don't know of any other public 
> Galaxy servers that offer this tool, but if you have access to a local or 
> cloud galaxy server you could use the provided wrapper to install the tool 
> for use there.
> 
> Thanks!
> 
> -Dannon
> 
> 
> 
> On Nov 8, 2011, at 6:40 AM, Laura Elizabeth Spoor wrote:
> 
>> Hi,
>> 
>> I use the Galaxy server and was wondering how to use the SNPeff tool. I have 
>> seen on their website that it can be integrated with Galaxy 
>> (http://snpeff.sourceforge.net/images/snpEff_galaxy.png) but cannot see it 
>> on the server. Is it something that can be run on the server?
>> 
>> Best Wishes,
>> 
>> Laura
>> 
>> -- 
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>> 
>> 

Re: [galaxy-user] SNPeff tool?

2011-11-08 Thread David Matthews

Hi,

Yes, I see that you can generate the VCF files that way but there is no 
seamless way of doing it entirely from within galaxy - i.e. you need to come 
out of galaxy at some point (or am I missing something?).


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk






On 8 Nov 2011, at 14:39, Chorny, Ilya wrote:

> I got it working just fine on my local server. Could you expand on your vcf 
> issue? I generate the vcf using gatk.
> 
> Sent from my iPhone
> 
> On Nov 8, 2011, at 6:36 AM, "David Matthews" <d.a.matth...@bristol.ac.uk> wrote:
> 
> Hi,
> 
> I've had a few email chats with the author of snpEff and the fly in the 
> ointment from my perspective is getting the VCF files it needs through 
> Galaxy. As I understand it there is currently no way of getting the BAM/SAM 
> files into the right input format so that snpEff can use them within a Galaxy 
> setup. So, whatever you do you'll still need one or two command line steps. 
> We have a copy of snpEff here at Bristol on our Galaxy and when we installed 
> it we then realised there was no Galaxy method (that we could think of) for 
> getting the input file ready for snpEff to use. This is a pity as it's 
> actually a very nice piece of software with a nice professional looking output.
> 
> Best Wishes,
> David.
> 
> __
> Dr David A. Matthews
> 
> Senior Lecturer in Virology
> Room E49
> Department of Cellular and Molecular Medicine,
> School of Medical Sciences
> University Walk,
> University of Bristol
> Bristol.
> BS8 1TD
> U.K.
> 
> Tel. +44 117 3312058
> Fax. +44 117 3312091
> 
> d.a.matth...@bristol.ac.uk
> 
> 
> 
> 
> 
> 
> On 8 Nov 2011, at 13:55, Dannon Baker wrote:
> 
> Hi Laura,
> 
> While the SNPeff developers have made Galaxy wrappers available, this is not 
> a tool we currently have installed for use on the Galaxy server at 
> main.g2.bx.psu.edu.  Off the top of my head, I don't know of any other public 
> Galaxy servers that offer this tool, but if you have access to a local or 
> cloud galaxy server you could use the provided wrapper to install the tool 
> for use there.
> 
> Thanks!
> 
> -Dannon
> 
> 
> 
> On Nov 8, 2011, at 6:40 AM, Laura Elizabeth Spoor wrote:
> 
> Hi,
> 
> I use the Galaxy server and was wondering how to use the SNPeff tool. I have 
> seen on their website that it can be integrated with Galaxy 
> (http://snpeff.sourceforge.net/images/snpEff_galaxy.png) but cannot see it on 
> the server. Is it something that can be run on the server?
> 
> Best Wishes,
> 
> Laura
> 
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
> 
> 

Re: [galaxy-user] selecting reads at random from fastq file

2011-11-09 Thread David Matthews

Hi,

This may be a bit dumb or missing the point but just selecting the first 5 
million is kind of random isn't it? I mean where the reads map and what they 
are from is not known to you and they were not collected by the sequencer in a 
manner that is influenced by the nature of the sample?
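
Taking the first 5 million reads is random only to the extent that file order is unrelated to the sample, which is an assumption about the sequencer output. If that assumption is in doubt, a uniform subsample can be drawn in one pass without loading the file into memory. The sketch below uses reservoir sampling; it assumes a well-formed FASTQ with exactly four lines per record, and the function names and toy data are invented for illustration.

```python
import random


def read_fastq(lines):
    """Group an iterable of lines into 4-line FASTQ records (assumed layout)."""
    record = []
    for line in lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            yield tuple(record)
            record = []


def reservoir_sample(records, k, seed=None):
    """Return k records chosen uniformly at random in a single pass."""
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < k:
            sample.append(rec)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = rec  # replace with probability k / (i + 1)
    return sample


# Toy input: ten reads, of which three are kept.
lines = []
for n in range(10):
    lines += [f"@read{n}", "ACGT", "+", "IIII"]
picked = reservoir_sample(read_fastq(lines), 3, seed=1)
for name, seq, plus, qual in picked:
    print(name)
```

Only k records are held in memory at any time, so the same approach should scale to tens of millions of reads; with a real file, pass the open file handle to read_fastq in place of the list.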


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk






On 9 Nov 2011, at 09:44, Hans-Rudolf Hotz wrote:

> Hi Paul, Hi Peter
> 
> You might also wanna look at the 'FastqSampler' function in the Bioconductor 
> 'ShortRead' package
> http://bioconductor.org/packages/release/bioc/html/ShortRead.html
> 
> We are working (as part of our NGS pipeline redesign) on adding more 
> Bioconductor functionalities to Galaxy. Unfortunately, it is very low on my 
> pile of stuff to do, so it will take a while till it appears in the 'Tool 
> Shed'.
> 
> 
> Regards, Hans
> 
> 
> 
> On 11/08/2011 11:45 PM, Peter Cock wrote:
>> On Tue, Nov 8, 2011 at 10:26 PM, Austin Paul  wrote:
>>> Hi Peter,
>>> 
>>> Thanks for the suggestion.  For example, I have a fastq file with 50 million
>>> reads and I want to randomly select 5 million of them. It seems biopython
>>> would very easily select a single or a handful of reads with the
>>> Bio.SeqIO.index() function.  Would it also be able to do the job I am
>>> interested in?
>>> 
>>> Austin
>> 
>> I think so, but you'd have to use Bio.SeqIO.index_db() which stores
>> the index in an SQLite dictionary rather than in memory which isn't
>> really viable here (unless you have a 64bit big memory machine?).
>> I don't think I've tried it with quite that many reads though...
>> 
>> Alternatively, if I understood her correctly, Jennifer pointed out you
>> can do this in Galaxy but it will take a lot of IO:
>> 
>> 1. Convert FASTQ to tabular (4 lines per record ->  1 line per record)
>> 2. Randomly select lines (each line is now a record so safe)
>> 3. Convert tabular back to FASTQ
>> 
>> It should work though, and requires no additional programming.
>> 
>> Peter
>> 

Re: [galaxy-user] public interface issues

2011-12-05 Thread David Matthews

Hi,

Sorry to add to the list of problems but I can't see "my history" at all, I can 
only see the main pane and the list of tools.


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk






On 5 Dec 2011, at 15:46, Mike Dufault wrote:

> Same here;  I can't initiate "Generate Pileup" from Sam tools. When I select 
> "Execute" the browser just stalls and does not add to my workflow. Also, I 
> can't download bam or bai files when I use right-click "save target as." I 
> have tried Internet Explorer and Firefox. 
> 
> --- On Mon, 12/5/11, Richard Mark White  wrote:
> 
> From: Richard Mark White 
> Subject: Re: [galaxy-user] public interface issues
> To: "Cittaro Davide" , "galaxy-u...@bx.psu.edu" 
> 
> Date: Monday, December 5, 2011, 9:17 AM
> 
> Likewise. I can get to my "Saved Histories", but when I click on one, very 
> few items (if any) show up in the right-side panel. I've also tried multiple 
> browsers, etc.
> rich
> 
> 
> From: Cittaro Davide 
> To: "galaxy-u...@bx.psu.edu"  
> Sent: Monday, December 5, 2011 7:39 AM
> Subject: [galaxy-user] public interface issues
> 
> Hi all, I can't use the public interface in an effective way: items in 
> history (i.e. the green boxes) cannot be expanded. More importantly, the names 
> of the items remind me of something to do with the tool/library path 
> (e.g. 2010_03/pilot2/README_pilot2_snps).
> This happens on OS X and MS Windows systems, Firefox and Safari.
> Thanks
> 
> d
> /*
> Davide Cittaro, PhD
> 
> Head of Bioinformatics Core
> Center for Translational Genomics and Bioinformatics
> San Raffaele Scientific Institute
> Via Olgettina 58
> 20132 Milano
> Italy
> 
> Office: +39 02 26439140
> Mail: cittaro.dav...@hsr.it
> Skype: daweonline
> */
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> ___
> The Galaxy User list should be used for the discussion of
> Galaxy analysis and other features on the public server
> at usegalaxy.org.  Please keep all replies on the list by
> using "reply all" in your mail client.  For discussion of
> local Galaxy instances and the Galaxy source code, please
> use the Galaxy Development list:
> 
>   http://lists.bx.psu.edu/listinfo/galaxy-dev
> 
> To manage your subscriptions to this and other Galaxy lists,
> please use the interface at:
> 
>   http://lists.bx.psu.edu/

Re: [galaxy-user] suggestions for de novo assembly plant transcriptome without reference

2011-12-21 Thread David Matthews

Hi Jane,

I have used Trinity on a local installation here at Bristol University. The 
main reason it's not on Galaxy main is that it is very memory intensive 
(we run it on nodes with 256GB RAM), so you really need access to a big machine 
to run it. Having said all that, the output is astoundingly good, so it's worth 
the time and effort to get a run going if you can.

Cheers
David

On 21 Dec 2011, at 13:36, Jane Song wrote:

> Dear Galaxy Expert,
> 
> I would like to use Galaxy for de novo assembly of single-end Illumina reads 
> (140bp) for plant transcriptomes (without a reference). I remember earlier 
> emails mentioning Trinity in Galaxy, but I could not see it on the Galaxy web 
> site http://main.g2.bx.psu.edu/root. Maybe it is installed on Amazon EC2? 
> Are there other suggestions for de novo assembly of plant transcriptomes 
> without a reference?
> 
> Many thanks and look forward to hearing back from you,
> Jane

[galaxy-user] BLAST+ on the test site

2012-01-04 Thread David Matthews

Hi,

I want to run BLASTp on the test site to compare it with some test runs on 
our local copy, but the tool does not run, saying "Index file named 
'blastdb_p.loc' is required by tool but not available". Am I doing 
something wrong, or is something missing at the test site?


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk



[galaxy-user] Make a vcf file

2012-02-14 Thread David Matthews

Hi,

This may be a dense question, but how do we generate a vcf file from the public 
version of Galaxy? Am I missing something obvious?


Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk






[galaxy-user] Tophat error

2012-03-14 Thread David Matthews


Hi,

Just running a TopHat job which returned the following error:

Executing: /gpfs/cluster/isys/galaxy/Software/bin/bowtie-inspect /local/tmp5Ywx45/dataset_942 > ./tophat_out/tmp/dataset_942.fa
[Tue Mar 13 12:45:08 2012] Checking for Bowtie
    Bowtie version:  0.12.7.0
[Tue Mar 13 12:45:08 2012] Checking for Samtools
    Samtools Version: 0.1.18
[Tue Mar 13 12:45:08 2012] Generating SAM header for /local/tmp5Ywx45/dataset_942
    format:  fastq
    quality scale:   phred33 (default)
[Tue Mar 13 12:45:21 2012] Preparing reads
    left reads: min. length=56, count=29523921
    right reads: min. length=56, count=29543412
[Tue Mar 13 13:07:54 2012] Mapping left_kept_reads against dataset_942 with Bowtie
[Tue Mar 13 13:45:26 2012] Processing bowtie hits
[Tue Mar 13 14:11:28 2012] Mapping left_kept_reads_seg1 against dataset_942 with Bowtie (1/2)
[Tue Mar 13 14:43:27 2012] Mapping left_kept_reads_seg2 against dataset_942 with Bowtie (2/2)
[Tue Mar 13 14:57:50 2012] Mapping right_kept_reads against dataset_942 with Bowtie
[Tue Mar 13 15:37:46 2012] Processing bowtie hits
[Tue Mar 13 16:04:28 2012] Mapping right_kept_reads_seg1 against dataset_942 with Bowtie (1/2)
[Tue Mar 13 16:37:18 2012] Mapping right_kept_reads_seg2 against dataset_942 with Bowtie (2/2)
[Tue Mar 13 16:50:40 2012] Searching for junctions via segment mapping
Traceback (most recent call last):
  File "/gpfs/cluster/isys/galaxy/Software/bin/tophat", line 3063, in <module>
    sys.exit(main())
  File "/gpfs/cluster/isys/galaxy/Software/bin/tophat", line 3029, in main
    user_supplied_deletions)
  File "/gpfs/cluster/isys/galaxy/Software/bin/tophat", line 2681, in spliced_alignment
    [maps[initial_reads[left_reads]].unspliced_bwt, maps[initial_reads[left_reads]].seg_maps[-1]],
TypeError: list indices must be integers, not str

Does anyone know what this kind of error is?

Best Wishes,
David.

__
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk






Re: [galaxy-user] Tophat error

2012-03-14 Thread David Matthews

Hi,

Thanks for the reply, sorry about the multiple posts - it kept getting bounced 
so I resubmitted the question. We seem to be having real problems with our 
local install so I'll add it to the list...!

Cheers
David



On 14 Mar 2012, at 18:10, Jennifer Jackson wrote:

> Hi David,
> 
> Your question has posted to the list now and we will be getting back to you. 
> It didn't post immediately due to some Mailman mail server issues here.
> 
> This looks like a problem that came up on a local instance. Because of that, 
> I am going to send this over to the galaxy-...@bx.psu.edu mailing list. At 
> first glance, this appears to be a problem with the NGS genome indexes used 
> for the target genome. These are the instructions you followed?
> http://wiki.g2.bx.psu.edu/Admin/NGS%20Local%20Setup (Bowtie indexes are used 
> for TopHat)
> 
> We will be looking at this more later today, but I wanted to get back to you 
> so that you know that this doesn't need to be posted again.
> 
> Thanks!
> 
> Jen
> Galaxy team


Re: [galaxy-user] [galaxy-dev] Tophat error

2012-03-20 Thread David Matthews

Hi Jen,
Many thanks for this, much appreciated. Will give it a try - fingers crossed!
David.

Sent from my phone, so please excuse my mistakes.

-Original Message-
From: Jennifer Jackson
Sent: 20/03/2012 18:32
To: David Matthews; Galaxy Dev
Cc: galaxy-user; Jeremy Goecks
Subject: Re: [galaxy-dev] [galaxy-user] Tophat error

Hi David,

I don't know if you are still having this problem or not, but I did a 
web search and found this thread on seqanswers from 2/16 that seems like 
a good match to the problem you were having:

http://seqanswers.com/forums/showthread.php?p=65085

These scientists resolved the problem by removing the "--closure-search" 
option from the command string.

On the Galaxy tool form, this is the option "Use Closure Search:", which 
is "No" by default. Perhaps you set this to "Yes"? I would try 
switching it to "No" to see if that solves the problem.
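
For anyone running TopHat outside the Galaxy form, the same change amounts to 
dropping one flag from the command line. A minimal sketch, assuming a made-up 
argument list: only "--closure-search" comes from this thread, while the index 
and read file names are hypothetical placeholders.

```python
def drop_flag(args, flag):
    """Return a copy of args with every occurrence of flag removed."""
    return [arg for arg in args if arg != flag]


# Hypothetical TopHat argument list; only --closure-search is taken from the thread.
command = ["tophat", "--closure-search", "index_basename", "reads_1.fq", "reads_2.fq"]

# Print the command as it would look with the flag removed.
print(" ".join(drop_flag(command, "--closure-search")))
```

The original list is left untouched, so the two variants can be compared side 
by side before rerunning the job.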

If not, then contacting the tool authors would probably be the best next 
step, either at seqanswers or directly at tophat.cuffli...@gmail.com. 
The original guess about genome indexes was way off base; this is a 
Python error message. I don't believe it is related to the Galaxy 
wrapper, but I will cc Jeremy for a second opinion.
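
For readers unfamiliar with the message: Python raises this TypeError whenever 
a list is indexed with a string instead of an integer. The sketch below 
reproduces that class of error with made-up data. The variable names echo the 
traceback but are hypothetical stand-ins, and this is not TopHat's actual code. 
Newer Python versions word the message slightly differently.

```python
def lookup(container, key):
    """Return container[key], or a description of the TypeError it raised."""
    try:
        return container[key]
    except TypeError as err:
        return "TypeError: " + str(err)


# Hypothetical stand-ins for the names that appear in the traceback.
left_reads = "left_kept_reads.fq"
reads_as_dict = {left_reads: "initial reads for the left mate"}
reads_as_list = ["initial reads for the left mate"]

print(lookup(reads_as_dict, left_reads))  # a dict can be keyed by a string
print(lookup(reads_as_list, left_reads))  # a list cannot, so TypeError is reported
```

In other words, the failing line expected a dictionary-like object keyed by a 
file name but received a plain list, which points at the script's own logic 
rather than at the input data.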

Hopefully the first option will resolve the issue!

Thanks,

Jen
Galaxy team


[galaxy-user] ftp not working

2014-03-26 Thread David Matthews

Hi,

I'm still not able to use FTP: it keeps saying "connection refused", whether 
through the command line or through Cyberduck. Is anyone else seeing this?
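
One way to tell a client problem from a server problem is to test whether the 
FTP port accepts TCP connections at all. A minimal sketch using only the Python 
standard library; the function name is my own, and the host to test is left to 
the caller because the thread does not name one.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed name lookups.
        return False
```

Calling can_connect("ftp.example.org", 21), with the placeholder host replaced 
by the real server name, returns False when the server refuses the connection, 
which would suggest the problem is not specific to one FTP client.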

Best Wishes,
David.

_
Dr David A. Matthews

Senior Lecturer in Virology
Room E49
Department of Cellular and Molecular Medicine,
School of Medical Sciences
University Walk,
University of Bristol
Bristol.
BS8 1TD
U.K.

Tel. +44 117 3312058
Fax. +44 117 3312091

d.a.matth...@bristol.ac.uk





___
To search Galaxy mailing lists use the unified search at:

  http://galaxyproject.org/search/mailinglists/
