Hi Lilach,

Regarding the cloud instance, you can load data from the public main instance of Galaxy just like any other URL. On the "Get Data -> Upload Data" form on your cloud instance, paste in the URLs of the datasets from main. To capture a URL, right-click on a dataset's disk icon and choose "Copy link location" (on a Mac; do the equivalent if using a PC).

If the data is large, it is generally better to transfer one URL per job, since each job has a limited amount of time to complete. If you lump several large file URLs together into one job, the job may time out. It is fine to run several jobs concurrently.
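
As a rough sketch of the "one URL per job" advice, the snippet below turns a batch of copied links into one paste-box entry per upload job. The helper name `plan_upload_jobs` and the example.org links are hypothetical placeholders; the code only prepares text to paste into the "Get Data -> Upload Data" form and does not talk to Galaxy itself.

```python
from typing import Iterable, List


def plan_upload_jobs(urls: Iterable[str]) -> List[str]:
    """Return one paste-box payload per upload job, one URL each.

    Hypothetical helper (not part of Galaxy): it only cleans up a list
    of copied links so that each large file gets its own upload job.
    """
    jobs: List[str] = []
    seen = set()
    for raw in urls:
        url = raw.strip()
        if not url or url in seen:
            continue  # skip blank lines and links copied twice
        seen.add(url)
        jobs.append(url)
    return jobs


if __name__ == "__main__":
    # Placeholder links; real ones come from "Copy link location".
    copied = """
    https://example.org/datasets/a/display
    https://example.org/datasets/b/display
    https://example.org/datasets/a/display
    """.splitlines()
    for number, payload in enumerate(plan_upload_jobs(copied), start=1):
        print(f"job {number}: {payload}")
```

Each printed entry would be pasted into its own upload form submission; the submissions can then run at the same time.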

Best,

Jen
Galaxy team

On 6/27/12 6:51 AM, Lilach Friedman wrote:
Hi Jennifer,
Is there a way to directly upload my files from the public Galaxy to my cloud Galaxy instance (in AWS)? Or should I download them to my computer first, and then upload them? (That takes a lot of time because of the low upload speed.)

Thanks,
   Lilach

2012/6/26 Jennifer Jackson <j...@bx.psu.edu>

    Hello Lilach,

    Currently, the human reference genome indexed for the GATK-beta
    tools is 'hg_g1k_v37'. The GATK-beta tools are under active
    revision by our team, so we expect there to be little to no change
    to the beta version on the main public instance until this is
    completed.

    Attempting to convert data between different builds is not
    recommended. These tools are very sensitive to exact inputs, which
    extends to naming conventions, etc. The best practice path is to
    start and continue an analysis project with the same exact genome
    build throughout.

    If you want to use the hg19 indexes provided by the GATK project,
    a cloud instance is the current option (using an hg19 genome as a
    'custom genome' will exceed the processing limits available on the
    public Galaxy instance). The links on the GATK tool forms provide
    more information about sources, including links to the GATK web
    site, which documents the exact contents of both of these genome
    versions, downloads, and other resources.

    Hopefully this helps to clear up any confusion,

    Best,

    Jen
    Galaxy team


    On 6/21/12 7:50 AM, Lilach Friedman wrote:
    Hi Jennifer,
    Thank you for this reply.

    I made a new BWA file, this time using the hg19(full) genome.
    However, when I try to use DepthOfCoverage, the reference
    genome is stuck on hg_g1k_v37 (this is the only option to
    select), and I cannot change it to hg19(full). Most probably
    this is because I selected hg_g1k_v37 the previous time I tried
    to use DepthOfCoverage.
    Is this a bug? How can I change it?

    Thanks,
      Lilach


    2012/6/18 Jennifer Jackson <j...@bx.psu.edu>

        Hi Lilach,

        The problem with this analysis probably has to do with a
        mismatch between the genomes: the intervals obtained from
        UCSC (hg19) and the BAM from your BWA (hg_g1k_v37) run.

        UCSC does not contain the genome 'hg_g1k_v37' - the genome
        available from UCSC is 'hg19'.

        Even though these are technically the same human release, on
        a practical level, they have a different arrangement for some
        of the chromosomes. You can compare NCBI GRCh37
        <http://www.ncbi.nlm.nih.gov/genome/assembly/2758/> with UCSC
        hg19 <http://genome.ucsc.edu> for an explanation. Reference
        genomes must be /exact/ in order to be used with tools - base
        for base. When they are exact, the identifier will be identical
        between Galaxy and the source (UCSC, Ensembl) or the full
        Build name will provide enough information to make a
        connection to NCBI or another source.

        Sometimes genomes are similar enough that a dataset sourced
        from one can be used with another, if the database attribute
        is changed and the data from the regions that differ is
        removed. This may be possible in your case; only trying will
        show how difficult it actually is with your analysis.
        The GATK pipeline is very sensitive to exact inputs. You will
        need to be careful with genome database assignments, etc.
        Following the links on the tool forms to the GATK help pages
        can provide some more detail about expected inputs, if this
        is something that you are going to try.
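
One quick way to spot this kind of mismatch before running GATK is to compare the chromosome names in the intervals file with the names in the reference. The sketch below is only an illustration: the function name is hypothetical, and it assumes the usual naming difference (UCSC hg19 uses 'chr1'-style names, while the 1000 Genomes hg_g1k_v37 reference uses '1'). Matching names alone do not prove that two builds are identical base for base.

```python
from typing import Iterable, Set


def contigs_missing_from_reference(interval_lines: Iterable[str],
                                   reference_contigs: Iterable[str]) -> Set[str]:
    """Return chromosome names used by the intervals but absent from the reference.

    Hypothetical helper: accepts BED-style lines ("chr1<TAB>100<TAB>200")
    or GATK-style intervals ("chr1:100-200").
    """
    reference = set(reference_contigs)
    missing: Set[str] = set()
    for line in interval_lines:
        line = line.strip()
        # Skip blanks, comments, and UCSC track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        contig = line.split()[0].split(":")[0]
        if contig not in reference:
            missing.add(contig)
    return missing


if __name__ == "__main__":
    # Assumed naming styles: the intervals follow UCSC hg19, the
    # reference names follow hg_g1k_v37.
    intervals = ["chr1\t1000\t2000", "chr2:500-900"]
    g1k_names = ["1", "2", "X", "Y", "MT"]
    print(sorted(contigs_missing_from_reference(intervals, g1k_names)))
```

An empty result means every interval names a chromosome that the reference knows about; any names that are returned point to the kind of build mismatch described above.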

        Good luck with the re-run!

        Jen
        Galaxy team


        On 6/18/12 4:42 AM, Lilach Friedman wrote:
        Hi,
        I am trying to use Depth of Coverage to see the coverage
        in specific intervals.
        The intervals were taken from UCSC (exons of 2 genes),
        loaded to Galaxy and the file type was changed to intervals.

        I gave Depth of Coverage two BAM files (produced by BWA,
        then selecting only rows matching the pattern XT:A:U, and
        then SAM-to-BAM) and the intervals file (in the advanced
        GATK options).
        The consensus genome is hg_g1k_v37.

        I got the following error message:

        An error occurred running this job: Picked up _JAVA_OPTIONS: -Djava.io.tmpdir=/space/g2main
        ##### ERROR ------------------------------------------------------------------------------------------
        ##### ERROR A USER ERROR has occurred (version 1.4-18-g80a4ce0):
        ##### ERROR The invalid argume


        Is it a bug, or did I do anything wrong?

        I will be grateful for any help.

        Thanks!
           Lilach


        ___________________________________________________________
        The Galaxy User list should be used for the discussion of
        Galaxy analysis and other features on the public server
        at usegalaxy.org <http://usegalaxy.org>.  Please keep all
        replies on the list by using "reply all" in your mail
        client.  For discussion of local Galaxy instances and the
        Galaxy source code, please use the Galaxy Development list:

           http://lists.bx.psu.edu/listinfo/galaxy-dev

        To manage your subscriptions to this and other Galaxy lists,
        please use the interface at:

           http://lists.bx.psu.edu/

        --
        Jennifer Jackson
        http://galaxyproject.org



