Hi All,
I downloaded some RNA-seq datasets from NCBI. The datasets were generated on an
Illumina HiSeq 2000. I am not sure which Input FASTQ quality scores type I
should choose when running FASTQ Groomer. Below are the scores of 2 reads from
one dataset; I renamed them read 1 and read 2.
1)
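On the quality-score question above, a rough check is to look at the range of ASCII characters in the quality lines. The sketch below is only a heuristic: the function name `guess_phred_offset` and the sample quality strings are made up for illustration and are not the reads from the post. It relies on Sanger / Illumina 1.8+ files using an offset of 33 and Illumina 1.3 to 1.7 files using an offset of 64.

```python
def guess_phred_offset(quality_lines):
    """Guess the FASTQ quality offset (33 or 64) from quality strings.

    Heuristic only: returns 33, 64, or None when the range is ambiguous.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        return None
    lowest, highest = min(codes), max(codes)
    if lowest < 59:      # characters below ';' cannot occur in offset-64 data
        return 33
    if highest > 74:     # characters above 'J' are not expected in offset-33 data
        return 64
    return None          # the range 59..74 fits both encodings


# Hypothetical quality strings, for illustration only
print(guess_phred_offset(["IIIIHHHGGG##", "@@@DDDDDFF<<"]))  # contains low characters such as '#'
print(guess_phred_offset(["hhhhggggffee", "ddddcccc^^BB"]))  # contains only high characters
```

Checking more than a couple of reads makes the guess more reliable, since a few high-quality reads alone can fall entirely in the ambiguous range.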
Hi All,
After I finished the Tophat alignment for RNA-seq, I took a look at the details of
the parameters by clicking the View details icon, and I got the information
shown below:
Input Parameter Value Note for rerun
RNA-Seq FASTQ file 73: Filtered Groomed data1_rep2
Use a built in reference
Hi All,
I have a very basic question about parameters for running TopHat.
I have datasets of single-end reads. These datasets were generated with
Illumina Genome Analyzer IIx. Which Library Type should I choose to run
Tophat?
Thanks.
Best,
Jianguang
?
Best,
Jianguang
From: Jeremy Goecks [jeremy.goe...@emory.edu]
Sent: Wednesday, April 10, 2013 3:16 PM
To: Du, Jianguang
Cc: galaxy-user@lists.bx.psu.edu
Subject: Re: [galaxy-user] Are reads of 36nt in length long enough to accurately
map on splicing junctions
Hi Jen,
Thanks for the information. I used this setting and the merged BAM files
(.accepted hits) worked quite well for the downstream analysis.
Best,
Jianguang
From: Jennifer Jackson [j...@bx.psu.edu]
Sent: Tuesday, April 09, 2013 4:10 PM
To: Du, Jianguang
Cc
be 33 nucleotides). So my understanding is that setting the Anchor
length at 3 does not increase the inaccuracy of the alignment. Am I correct?
Best,
Jianguang
From: Jeremy Goecks [jeremy.goe...@emory.edu]
Sent: Tuesday, April 09, 2013 1:57 PM
To: Du, Jianguang
in the .splicing junctions output. Is my
understanding correct? Does "regions" mean the number of mapped splicing
junctions?
Thanks.
Best,
Jianguang
From: Jeremy Goecks [jeremy.goe...@emory.edu]
Sent: Tuesday, April 09, 2013 9:03 AM
To: Du, Jianguang
Cc
Hi All,
I have a very basic question. I have RNA-seq datasets of several cell types and
want to compare the alternative splicing events between cell types. The reads
are 36nt in length. Are these reads long enough to map to the splicing
junctions accurately when I run Tophat with stringent
Hi All,
I want to merge the Tophat output (Accepted Hits) of several datasets. I want
the merged BAM file to have exactly the same format as the individual input BAM
files. Should I check Merge all component bam file headers into the merged bam file?
Thanks.
Have a nice weekend.
Jianguang
Hi All,
Is there a size limit of dataset for running Tophat at Galaxy? If there is, how
many reads is the limit?
Thanks.
Jianguang
___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the
Hi Everyone,
When I upload my datasets to my history via the FTP method (using FileZilla), do
I need to specify the file format under File Format of Upload File from your
computer?
I noticed that the screencast of how to upload datasets via FTP just leaves the
File Format as Auto-detect.
Dear Sir or Madam,
I had opened multiple accounts at Galaxy Main; I did not know that this is
against policy. I only noticed the policy when I found that all of the accounts
were blocked. Would you please restore the account with the email address
jia...@iupui.edu?
If you are not
Dear All,
I am comparing the gene expression between two cell types by examining the
Cufflinks output file "gene differential expression
testing". The file lists the FPKM of genes in the two cell
types and the log2 fold change. I want to look for genes that have more than 2-fold of
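Picking out genes that change by more than 2-fold amounts to keeping rows whose absolute log2 fold change exceeds 1. A minimal sketch, assuming the table has already been reduced to (gene, FPKM in cell type 1, FPKM in cell type 2) tuples; the gene names, values, and function name are hypothetical, and the optional pseudocount is an illustrative way to handle zero FPKM values, not something taken from the Cuffdiff output.

```python
import math

def genes_changed_over(rows, min_fold=2.0, pseudocount=0.0):
    """Return (gene, log2_fold) for genes whose FPKM changes by more than min_fold.

    rows: iterable of (gene, fpkm_1, fpkm_2) tuples.
    Genes with a zero FPKM are skipped unless a pseudocount is supplied.
    """
    threshold = math.log2(min_fold)
    selected = []
    for gene, fpkm_1, fpkm_2 in rows:
        a, b = fpkm_1 + pseudocount, fpkm_2 + pseudocount
        if a <= 0 or b <= 0:
            continue  # the log2 fold change is undefined here
        log2_fold = math.log2(b / a)
        if abs(log2_fold) > threshold:
            selected.append((gene, log2_fold))
    return selected

# Hypothetical values, for illustration only
example = [("geneA", 10.0, 45.0), ("geneB", 8.0, 9.0),
           ("geneC", 20.0, 4.0), ("geneD", 0.0, 5.0)]
for gene, log2_fold in genes_changed_over(example):
    print(gene, round(log2_fold, 2))
```

A fold-change cutoff alone says nothing about statistical significance, so in practice it would be combined with the significance columns of the same file.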
Dear All,
I want to use the Tophat output files with .accepted hits to do analysis
outside Galaxy. However, the program I am using requires the Tophat output to
be indexed, sorted BAM files that contain headers. Do the Tophat outputs with
.accepted hits produced at Galaxy contain headers? Will
Dear All,
I am not so sure about two Tophat settings. Please help.
1) Number of mismatches allowed in the initial read mapping
Based on the documentation, my understanding is: the reads are re-aligned to the
transcriptome/genome if the mismatches in the initial alignment are more than
the set
, and then compare between conditions.
Thanks in advance,
Jianguang
From: Jennifer Jackson [j...@bx.psu.edu]
Sent: Thursday, September 06, 2012 12:38 PM
To: Du, Jianguang
Cc: galaxy-user@lists.bx.psu.edu; closetic...@galaxyproject.org
Subject: Re: [galaxy
Dear All,
I tested how to set the Number of mismatches allowed in the initial read
mapping as follows.
At first, I ran FASTQ Groomer on a dataset to get the number of total reads.
The total number of the reads is 17510227.
Then I ran Tophat after setting Number of mismatches allowed in the
Dear All,
I am looking for the differential splicing events between cell types. However,
Cuffdiff reports its output using the square root of the Jensen-Shannon divergence
to measure the difference.
Although I tried my best to understand the definition of the square root of
Jensen-Shannon
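For intuition about the metric: the Jensen-Shannon divergence between two distributions of isoform proportions is the entropy of their average minus the average of their entropies, and its square root behaves as a distance. The sketch below computes it with base-2 logarithms, so the result lies between 0 and 1. The proportions are made up, and this illustrates the definition only, not Cuffdiff's own implementation or its significance test.

```python
import math

def entropy(dist):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def js_distance(p, q):
    """Square root of the Jensen-Shannon divergence between two distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    divergence = entropy(mixture) - (entropy(p) + entropy(q)) / 2
    return math.sqrt(max(divergence, 0.0))  # clamp tiny negative rounding errors

# Hypothetical relative abundances of three isoforms in two cell types
cell_type_1 = [0.7, 0.2, 0.1]
cell_type_2 = [0.2, 0.3, 0.5]
print(js_distance(cell_type_1, cell_type_2))
```

Identical isoform proportions give 0, and completely non-overlapping isoform usage gives 1, so larger values mean a bigger shift in relative isoform abundance.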
Dear All,
I have two more questions about settings for Tophat.
My aim is to look for the differential splicing events between cell types.
After I checked Use Own Junctions, three more options came out:
1) Use Gene Annotation Model
2) Use raw Junctions
3) Only look for supplied junctions
Dear All,
I ran Flagstat under NGS: SAM Tools to check the quality of the Tophat
output (the file of accepted hits). I got the diagnosis results as follows:
9471730 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 duplicates
9471730 + 0 mapped (100.00%:-nan%)
0 + 0 paired in sequencing
0
Dear All,
I am looking for the differential splicing events between cell types.
Although I got a lot of help from Jen and from protocols found online, I am
still not sure about some settings for Cufflinks, Cuffmerge and Cuffdiff.
1) For Cufflinks:
There is a setting for Bias Correction. I made
Dear All,
I am looking for the differential splicing events between cell types. I have run
Cuffdiff and I am going through the output file splicing differential
expression testing. I have read the documentation and protocols about how
Cuffdiff tests for differential expression and
Dear All,
I am analysing RNA-seq datasets for the differential splicing events between
cell types. My reads are 36bp long. In order to increase the quality of the reads,
I need to trim some nucleotides from the ends. How many nucleotides can I trim? I
am afraid that if I trim too much, the reliability
Dear All,
I am analysing RNA-seq datasets for differential splicing events between cell
types.
Some of my reads contain bad nucleotides. Should I run Filter FASTQ to remove
these lower-quality reads? If I do need to, what Minimum Quality
should I set for the filter?
Thanks.
Jianguang
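One way to think about the Minimum Quality setting: a read is kept only if its bases meet the cutoff. The sketch below is a simplified stand-in for that idea, not the Galaxy Filter FASTQ tool itself. It assumes Sanger-encoded (Phred+33) qualities, and the cutoff of 20, the function names, and the example reads are arbitrary illustrations rather than recommendations.

```python
def phred_scores(quality_string, offset=33):
    """Convert a FASTQ quality string into a list of Phred scores."""
    return [ord(ch) - offset for ch in quality_string]

def passes_min_quality(quality_string, min_quality=20, max_low_bases=0, offset=33):
    """True if at most max_low_bases bases fall below min_quality."""
    low = sum(1 for score in phred_scores(quality_string, offset) if score < min_quality)
    return low <= max_low_bases

# Hypothetical reads as (name, sequence, quality) tuples
reads = [
    ("read_1", "ACGTACGT", "IIIIIIII"),   # every base at Phred 40
    ("read_2", "ACGTACGT", "IIII##II"),   # two bases at Phred 2
]
kept = [name for name, _seq, qual in reads if passes_min_quality(qual)]
print(kept)
```

Allowing a small number of low-quality bases (`max_low_bases`) is a gentler alternative to discarding every read with a single bad base, which matters when reads are short.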
Dear All,
I am analysing RNA-seq datasets for differential splicing events between cell
types. These are mouse cells. Jen suggested that I use the iGenomes version of
the reference GTF to take full advantage of the options in CuffDiff. My question
is: should I use this iGenomes version of the reference GTF
From: Jennifer Jackson [j...@bx.psu.edu]
Sent: Thursday, August 23, 2012 11:46 AM
To: Du, Jianguang
Cc: galaxy-user@lists.bx.psu.edu
Subject: Re: [galaxy-user] Should I use iGenomes version of a reference GTF for
Tophat?
Hello Jianguang,
When in the analysis
Use a built-in index.
How can I solve this problem?
Thanks in advance.
Jianguang
From: galaxy-user-boun...@lists.bx.psu.edu
[galaxy-user-boun...@lists.bx.psu.edu] on behalf of Du, Jianguang
[jia...@iupui.edu]
Sent: Thursday, August 23, 2012 4:01 PM
Dear All,
I have run programs from Tophat to Cuffdiff in Galaxy to look for the
differences in alternative splicing events between cell types. However, I do not
know how to find the detailed information (such as the sequence and the genomic
coordinates) of the alternatively spliced part of a
Dear All,
I am going to run Tophat on an RNA-seq dataset to observe alternative splicing
events. There is a parameter for Tophat: Minimum length of read segment.
According to the implemented Tophat options, the description for Minimum length
of read segment is: Each read is cut up into segments,
Dear All,
In order to figure out the Mean Inner Distance between Mate Pairs of my
paired-end RNA-seq datasets, I ran Bowtie (Map with Bowtie for Illumina) with
both forward and reverse datasets and mouse mm9 as reference genome. Below I
list the Bowtie output for only one pair of reads (I put
Dear All,
I am analyzing the downloaded RNA-seq datasets. However, I am not sure what
the Mean Inner Distance between Mate Pairs is for these paired-end datasets.
Taking one paired-end RNA-seq dataset as an example, there is a description for
this dataset in the SRA database of NCBI: Layout: PAIRED,
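As the TopHat documentation describes it, the mean inner distance is the expected fragment size minus the length of both reads. A small sketch of that arithmetic; the fragment size and read length below are invented numbers, not values from the SRA record in question, and the function name is hypothetical.

```python
def mean_inner_distance(mean_fragment_length, read_length_1, read_length_2=None):
    """Inner distance between mates = fragment length minus both read lengths.

    The fragment length must exclude adapters. A negative result means the
    two mates overlap.
    """
    if read_length_2 is None:
        read_length_2 = read_length_1  # assume both mates have the same length
    return mean_fragment_length - read_length_1 - read_length_2

# Hypothetical library: 300 bp fragments sequenced as 2 x 50 bp
print(mean_inner_distance(300, 50))
```

If the record reports only a nominal insert length, it is worth checking whether that figure includes the adapters before subtracting the read lengths.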
Dear All,
I want to compare the pre-mRNA alternative splicing events between RNA-seq
datasets. Do I need to allow indel search when I run Tophat? What is the indel
search for? I could not find detailed information about indel search in
the documentation of Tophat.
Thanks.
Jianguang Du
Dear All,
I want to compare the pre-mRNA alternative splicing events between RNA-seq
datasets. Should I use own junctions when I run Tophat? What does Own
Junctions mean?
Thanks.
Jianguang DU
Dear All,
I am going to search for alternative splicing events between datasets. I am
not sure about the settings of mouse reference genome (mm9) when I upload it
from UCSC Main.
Would you please tell me the settings for
1) group:
2) Track:
3) Table:
4) Output format:
Thanks.
I have a problem splitting a paired-end FASTQ dataset into two separate datasets.
In order to explain the problem clearly, I list the details of what I did with
my dataset:
Step 1) My aim is to compare datasets for the differential alternative
splicing. I downloaded paired-end datasets at FASTQ
dataset into two datasets, how should I choose
the settings when I run Manipulate FASTQ?
Thanks.
Jianguang
On 8/10/12 7:21 AM, Du, Jianguang wrote:
I have a problem splitting a paired-end FASTQ dataset into two separate
datasets. In order to explain the problem clearly