Hi Jill,
I am pretty certain that I found out why mm7 is not extracting - the
database is not fully set up for use with this tool (although the data is
present). I'll add this to the list of items to adjust this upcoming
month (plus find/fix any others like it - all would be older DBs).
And glad the tab file is now working. Whenever you really do have just a
tabular file, using a plain text editor is best, along with the
'Convert spaces to tabs:' option on the 'Get Data -> Upload File' form.
Most bioinformatics folks know that it is wise to carefully screen any
"text" output from Excel, primarily because of inserted 'hidden' or
whitespace characters (soft returns and such). Not Excel's fault, nor
any other editor's - but what you did (cycle through a plain text
editor) is one way to get clean data.
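For anyone who wants to check a file before uploading, here is a minimal Python sketch that reports lines containing carriage returns, non-breaking spaces, or other non-printable characters. It is only an illustration, not a Galaxy tool: the function name `find_hidden_characters` and the sample data are made up for this example.

```python
def find_hidden_characters(text):
    """Return (line_number, [repr of suspicious chars]) for each affected line."""
    problems = []
    # Split on "\n" only, so stray carriage returns stay visible in the line.
    for number, line in enumerate(text.split("\n"), start=1):
        # str.isprintable() is False for control characters and for
        # non-ASCII spaces; tabs are expected in tabular data, so skip them.
        suspicious = sorted(
            {ch for ch in line if ch != "\t" and not ch.isprintable()}
        )
        if suspicious:
            problems.append((number, [repr(ch) for ch in suspicious]))
    return problems


# Hypothetical sample: line 1 ends with a carriage return,
# line 2 hides a non-breaking space after the end coordinate.
sample = (
    "chr1\t4558068\t4561910\tregion_0\t0\t+\r\n"
    "chr1\t100\t200\u00a0\tregion_1\t0\t+\n"
)
for line_number, chars in find_hidden_characters(sample):
    print(line_number, chars)
```

When reading a real file, opening it with `open(path, newline="")` keeps Python from translating line endings, so carriage returns are still there to be found.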
Now, that said -> never use that upload option on any file that
contains internal spaces, such as GFF/GTF or SAM. For plain text
tabular data, in particular strict BED, it can help clean up stray
spaces or tabs that were introduced. Other tools in Text manipulation
can also help for data already loaded (try cutting out the columns you
want to use, maybe after converting all whitespace to tabs first).
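To show why converting whitespace to tabs is fine for strict BED but harmful for formats with internal spaces, here is a small Python sketch. It is an assumption about what such a conversion does in general (collapse each run of spaces or tabs into one tab), not Galaxy's actual implementation; `spaces_to_tabs` and both sample lines are hypothetical.

```python
import re


def spaces_to_tabs(line):
    """Collapse each run of spaces/tabs into one tab; trim the line ends."""
    return re.sub(r"[ \t]+", "\t", line.strip())


# Strict BED with messy separators: the six columns come out clean.
bed = "chr1  4558068 \t4561910 region_0\t0 +\r\n"
bed_fields = spaces_to_tabs(bed).split("\t")
print(len(bed_fields), bed_fields)

# GTF: the ninth column legitimately contains spaces, so the same
# conversion breaks it into extra columns.
gtf = 'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";'
gtf_fields = spaces_to_tabs(gtf).split("\t")
print(len(gtf.split("\t")), "columns before,", len(gtf_fields), "after")
```

The BED line ends up with exactly six tab-separated fields, while the GTF line goes from nine columns to twelve because the attribute column is split at each space.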
Thanks and glad you have a working solution. I missed the details of the
mm7 extract issue originally - sorry if that was confusing!
Jen
Galaxy team
On 10/31/13 6:46 AM, Kreiling, Jill wrote:
Thank you Jen. You mentioned it may be a formatting problem and you
were able to successfully convert the coordinates to mm8. I tried
that several times yesterday and they kept coming up in the unmapped
file saying the region was deleted from the newer build. I opened the
tab-delimited text file I created in Excel in Notepad++ and just
resaved it without changing anything. When I uploaded the new file to
Galaxy and lifted over to mm8 it worked fine. It still wouldn't
pull out genomic sequences from mm7, but it will from the new file
converted to mm8. Thank you for your help - it is very much appreciated!
Jill
On Wed, Oct 30, 2013 at 11:45 PM, Jennifer Jackson <j...@bx.psu.edu
<mailto:j...@bx.psu.edu>> wrote:
Hello Jill,
This is strange. I just pasted the region you noted below into
Galaxy (in the 'Get Data -> Upload File' tool), assigned it to
mm7, and lifted to mm8 without any issues. I also checked the data
behind the tool - all appears to be fine.
result in mm8 coordinates
chr1 4552557 4556399 region_0 0 +
Are you certain there is not a format problem with the data? This
seems to be the only explanation for the problem. But after one
more check, you can submit a bug report and note that this is the
problem. Be sure to leave the input and all error outputs
undeleted when you report the problem or we won't be able to offer
the best feedback.
It is true that UCSC only produced liftOver files that go from mm7
to mm6 or mm8; from mm8 you can then go to mm7, mm9, or mm10. This is
just the data available. When lifting from data this old, be aware
that a genome can change quite a bit in some regions over 3 revisions.
Still, lifting this way is certainly something you can try. If a
much older genome is not in Galaxy, just do the lift at UCSC (the
liftOver tool is under the top blue banner "Tools").
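The multi-step lift described above can be sketched as a shortest-path search over the available conversions. This is only an illustration: the `DIRECT_LIFTS` table holds just the conversions mentioned in this message (the real set of liftOver files is larger), and `lift_path` is a hypothetical helper, not part of Galaxy or the UCSC tools.

```python
from collections import deque

# Direct lifts mentioned in this thread only (an assumption for the sketch).
DIRECT_LIFTS = {
    "mm7": ["mm6", "mm8"],
    "mm8": ["mm7", "mm9", "mm10"],
}


def lift_path(start, target, lifts=DIRECT_LIFTS):
    """Breadth-first search for the shortest chain of builds, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for next_build in lifts.get(path[-1], []):
            if next_build not in seen:
                seen.add(next_build)
                queue.append(path + [next_build])
    return None


print(lift_path("mm7", "mm10"))
```

With the table above, mm7 reaches mm10 by way of mm8, which matches the two-step lift described in the message. Each extra step can lose regions, so the shortest path is usually the one to try first.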
Hopefully the problem can be sorted out but if not we can take a look,
Jen
Galaxy team
On 10/30/13 3:04 PM, Kreiling, Jill wrote:
Hello, I have a set of coordinates for mm7 that I have been
using to try to extract the genomic sequences. However, it doesn't
recognize the chromosome name column. They are currently listed
as chr1, chr2, ... chrX. This is the error I get each time I try
to extract sequences:
Chromosome by name 'chr1' was not found for build 'mm7'. Skipped
1181 invalid lines, 1st is #1, "chr1 4558068 4561910 region_0 0 +"
However if I change the build to mm10 it works fine - but the
coordinates are not the same between builds. Also, mm7 can't be
lifted over to mm9 or mm10.
Does anyone know the proper format for chromosome names in mm7?
Thanks,
Jill
___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org <http://usegalaxy.org>. Please keep all replies on the list by
using "reply all" in your mail client. For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists,
please use the interface at:
http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at:
http://galaxyproject.org/search/mailinglists/
--
Jennifer Hillman-Jackson
http://galaxyproject.org
--
Jill Kreiling, Ph.D.
Assistant Professor, Research
Department of Molecular Biology, Cell Biology and Biochemistry
Brown University
Providence, RI 02903