Re: Store query results in new RDF

Andy Seaborne Thu, 07 Nov 2013 05:58:06 -0800

On 07/11/13 02:55, Adeeb Noor wrote:


On Wed, Nov 6, 2013 at 5:23 AM, Andy Seaborne <a...@apache.org
<mailto:a...@apache.org>> wrote:

    On 06/11/13 00:31, Adeeb Noor wrote:

        Any help with my question please.

        AdeeB


        On Mon, Nov 4, 2013 at 1:48 PM, Adeeb Noor
        <adeeb.n...@colorado.edu <mailto:adeeb.n...@colorado.edu>> wrote:

            Hi Andy:

            Thanks for the response.

            My TDB is on my hard drive and is 15GB in size.


    How many triples?


*The number of triples is: 37397456*


    And how much of the DB does the SELECT query match?

    What's SELECT (count(*) AS ?c) ....
    What's SELECT (count(distinct *) AS ?c) ....


No response?

This is asking what proportion of the DB is being extracted.
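
To put a number on that, here is a minimal sketch of running the count from Java. It assumes a Jena 3 or later release (packages under org.apache.jena; the 2.x releases use com.hp.hpl.jena instead). The class and method names are made up for illustration, and the WHERE clause passed in has to be the one from your real query.

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QuerySolution;

public class MatchCount {
    // Hypothetical helper: runs a query of the form
    //   SELECT (count(*) AS ?c) WHERE { ... }
    // and returns the value bound to ?c.
    static long count(Dataset dataset, String countQuery) {
        try (QueryExecution qexec = QueryExecutionFactory.create(countQuery, dataset)) {
            // An aggregate without GROUP BY yields exactly one row.
            QuerySolution row = qexec.execSelect().next();
            return row.getLiteral("c").getLong();
        }
    }
}
```

Comparing the result with the 37397456 triples in the store gives a feel for how much of the DB the query touches.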

On the current information, I'd guess the system starts swapping due to a large construct graph, but that's just a guess.


Streaming the execConstructTriples to a disk file may help.
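
A rough sketch of what that could look like, assuming Jena 3 or later package names (org.apache.jena; 2.x uses com.hp.hpl.jena for the query classes). RDFDataMgr.writeTriples writes N-Triples as the iterator is consumed, so the result is never built up as an in-memory Model. The class name, method name and output path are placeholders, not anything from your code.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Iterator;

import org.apache.jena.graph.Triple;
import org.apache.jena.query.Dataset;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.riot.RDFDataMgr;

public class StreamConstruct {
    // Hypothetical helper: writes the CONSTRUCT output to outFile as
    // N-Triples while the iterator is consumed, instead of building a Model.
    static void constructToFile(Dataset source, String constructQuery, String outFile)
            throws IOException {
        try (QueryExecution qexec = QueryExecutionFactory.create(constructQuery, source);
             OutputStream out = new BufferedOutputStream(new FileOutputStream(outFile))) {
            Iterator<Triple> triples = qexec.execConstructTriples();
            RDFDataMgr.writeTriples(out, triples);
        }
    }
}
```

Note that execConstructTriples does not remove duplicates, so the file may contain the same triple more than once; loading it into a graph afterwards collapses them.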



*Here is the SELECT I want to apply to generate my subgraph:*


Long, incomplete query.

Try reordering the FILTERs, putting the || ones last. It may make no difference, but without sizing figures only you can know.


        Andy


CONSTRUCT

{

ddidd:C0004057 ?r ?disease1 .

?disease1 ?r10 ddidd:C0004057 .

?disease1 ?r1 ?omim1 .

?omim1 ?r11 ?disease1 .

?omim1 ?r2 ?w .

?w ?r12 ?omim1 .

?omim1 ?r3 ?bp .

?bp ?r13 ?omim1 .

?omim1 ?r4 ?genotypePhenotype .

?genotypePhenotype ?r14 ?omim1 .

?omim1 ?r5 ?gene.

?gene ?r15 ?omim1 .

?w ?r6 ?gene2.

?gene2 ?r16 ?w .

?omim1 ?r7 ?gene3 .

?gene3 ?r17 ?omim1.

?gene3 ?r8 ?bp2 .

?bp2 ?r18 ?gene3 .

?gene3 ?r9 ?genotypePhenotype2 .

?genotypePhenotype2 ?r19 ?gene3 .

?gene a ?gCLASS.

?gene2 a ?g2CLASS.

?gene3 a ?g3CLASS.

?genotypePhenotype a ?genotypePhenotypeCLASS .

?genotypePhenotype2 a ?genotypePhenotype2CLASS.

?w a ?wCLASS .

?omim1 a ?omimt1 .

?bp a ?bpCLASS .

?bp2 a ?bp2CLASS .

ddidd:C0004057 ddids:label ?ldrug1 .

?disease1 ddids:label ?ldisease1 .

?omim1 ddids:label ?lomim1 .

?w ddids:label ?lw .

?bp ddids:label ?lbp .

?genotypePhenotype ddids:label ?lgenotypePhenotype .

?gene ddids:label ?lgene .

?gene2 ddids:label ?lgene2 .

?gene3 ddids:label ?lgene3 .

?bp2 ddids:label ?lbp2 .

?genotypePhenotype2 ddids:label ?lgenotypePhenotype2 .

}WHERE{

ddidd:C0004057 ?r ?disease1 .

?disease1 ?r10 ddidd:C0004057 .

  ?disease1 ?r1 ?omim1 .

?omim1 ?r11 ?disease1 .

?omim1 ?r2 ?w .

?w ?r12 ?omim1 .

?omim1 ?r3 ?bp .

?bp ?r13 ?omim1 .

?omim1 ?r4 ?genotypePhenotype .

?genotypePhenotype ?r14 ?omim1 .

?omim1 ?r5 ?gene.

?gene ?r15 ?omim1 .

?w ?r6 ?gene2.

?gene2 ?r16 ?w .

?omim1 ?r7 ?gene3 .

?gene3 ?r17 ?omim1.

?gene3 ?r8 ?bp2 .

?bp2 ?r18 ?gene3 .

?gene3 ?r9 ?genotypePhenotype2 .

?genotypePhenotype2 ?r19 ?gene3 .

?gene a ?gCLASS.

?gene2 a ?g2CLASS.

?gene3 a ?g3CLASS.

?genotypePhenotype a ?genotypePhenotypeCLASS .

?genotypePhenotype2 a ?genotypePhenotype2CLASS.

?w a ?wCLASS .

?omim1 a ?omimt1 .

?bp a ?bpCLASS .

?bp2 a ?bp2CLASS .

ddidd:C0004057 ddids:label ?ldrug1 .

?disease1 ddids:label ?ldisease1 .

?omim1 ddids:label ?lomim1 .

?w ddids:label ?lw .

?bp ddids:label ?lbp .

?genotypePhenotype ddids:label ?lgenotypePhenotype .

?gene ddids:label ?lgene .

?gene2 ddids:label ?lgene2 .

?gene3 ddids:label ?lgene3 .

?bp2 ddids:label ?lbp2 .

?genotypePhenotype2 ddids:label ?lgenotypePhenotype2 .


FILTER ( ?r = ddids:may_treat ||  ?r = ddids:may_prevent )

FILTER (?omimt1 = ddids:gene || ?omimt1 = ddids:genotypePhenotype )

FILTER (?wCLASS = ddids:pathway || ?r2 = ddids:gene_is_element_in_pathway )

FILTER (?bpCLASS = ddids:biologicalProcess )

FILTER (?bp2CLASS = ddids:biologicalProcess )

FILTER (?genotypePhenotypeCLASS = ddids:genotypePhenotype )

FILTER (?genotypePhenotype2CLASS = ddids:genotypePhenotype )

FILTER (?gCLASS = ddids:gene )

FILTER (?g2CLASS = ddids:gene )

FILTER (?g3CLASS = ddids:gene )

}


    We still know little about your setup.



            and my PC is a Mac Pro with
            a 2.4 GHz CPU and 4GB of memory.


    Java 32 bit or 64 bit?


*java version "1.6.0_65"*
*Java(TM) SE Runtime Environment (build 1.6.0_65-b14-462-11M4609)*
*Java HotSpot(TM) 64-Bit Server VM (build 20.65-b04-462, mixed mode)*




            I was not able to use QueryExecution.execConstructTriples as
            it returns an iterator and I want to save the subgraph
            into a new TDB.


    Why is that a problem? Add them to a TDB database.


* I will try it and let you know. *
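
For the TDB route, something along these lines might do. It is a sketch only, again assuming Jena 3 or later package names. The class and method names are invented, and how the call is wrapped (transactions, or non-transactional use followed by a sync) is left to the caller because it depends on your setup.

```java
import java.util.Iterator;

import org.apache.jena.graph.Graph;
import org.apache.jena.graph.Triple;
import org.apache.jena.query.Dataset;
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;

public class ConstructIntoDataset {
    // Hypothetical helper: adds each triple the CONSTRUCT query produces to
    // the default graph of 'target' and returns how many were sent.
    // Duplicates are counted here but stored only once by the graph.
    static long copy(Dataset source, String constructQuery, Dataset target) {
        Graph graph = target.asDatasetGraph().getDefaultGraph();
        long sent = 0;
        try (QueryExecution qexec = QueryExecutionFactory.create(constructQuery, source)) {
            Iterator<Triple> triples = qexec.execConstructTriples();
            while (triples.hasNext()) {
                graph.add(triples.next());
                sent++;
            }
        }
        return sent;
    }
}
```

With TDB, source and target would each come from TDBFactory.createDataset(directory), pointing at two different directories.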


    Or even use a SPARQL Update operation.


*SPARQL Update, if I am not wrong, will not work in my case, as I want to
create a subgraph from the whole data and store it in a new TDB. We can
only use SPARQL Update if we are adding data to the original TDB; am I
right?*




            Here is my code below:

               FileLoader fileLoader = new FileLoader("src/DDICONSTRUCT.tql");

               String q = fileLoader.loadAll();

               Query query = QueryFactory.create(q) ;

               QueryExecution qexec = QueryExecutionFactory.create(query, data.tdb);


               Model constructModel = qexec.execConstruct();


            The program has been running for almost a day now. Let me
            know if there is something wrong or if there is an
            alternative to CONSTRUCT.



            On Sun, Nov 3, 2013 at 12:59 PM, Andy Seaborne
            <a...@apache.org <mailto:a...@apache.org>> wrote:

                On 03/11/13 07:05, Adeeb Noor wrote:

                    Hi Andy:

                    I did figure it out; however, the CONSTRUCT takes too
                    much time to finish as my query is complex. Is that
                    normal? In fact, it is still running.


                Hard to tell - it depends on many factors such as
                machine setup, where
                the data is stored, structure and volume of your data

                Try

                QueryExecution.execConstructTriples

                          Andy



                    AdeeB


                    On Sat, Nov 2, 2013 at 9:56 AM, Adeeb Noor
                    <adeeb.n...@colorado.edu
                    <mailto:adeeb.n...@colorado.edu>>
                    wrote:

                       Hi Andy:


                        Thanks for the quick response. I tried CONSTRUCT
                        and it did work out. But how can I rewrite a query
                        like this one as a CONSTRUCT:

                        SELECT DISTINCT *

                            {

                             ?ddi ddids:has_association ?c .

                            ?ddi ddids:has_association ?c2 .

                        ?c ddids:chemical_or_drug___affects_gene_product
                        ?omim .

                        ?omim ddids:gene_product_encoded_by___gene ?g .

                        ?g ddids:gene_plays_role_in___process ?w .

                        ?g ddids:gene_plays_role_in___process ?bp .

                        ?bp ddids:process_involves_gene ?g2 .

                        ?g2 ddids:gene_plays_role_in___process ?bp2 .


                        where I need each variable (for example ?w, ?bp,
                        etc.) to be a new resource.

                        Thanks


                        On Sat, Nov 2, 2013 at 6:41 AM, Andy Seaborne
                        <a...@apache.org <mailto:a...@apache.org>> wrote:

                           You need to use a CONSTRUCT query, not a
                        SELECT one.


                            outputAsRDF encodes the result set (i.e. the
                            table) as RDF - it is not
                            the datamodel of the original data.

                            CONSTRUCT allows you to create one RDF graph
                            from data from another.

                            See also SPARQL Update for doing that from
                            one graph to another in the
                            same database.

                                       Andy


                            On 02/11/13 05:35, Adeeb Noor wrote:

                               Hi guys:


                                I would like to save my SPARQL result
                                coming from ResultSet into new RDF
                                (new RDF resources) because I want to do
                                more work on this subgraph and it has to
                                be in the original RDF format.

                                I tried the outputAsRDF function and it
                                worked; however, the result I got was the
                                following:

                                <rdf:Description rdf:nodeID="A5">
                                        <rs:value rdf:resource="https://csel.cs.colorado.edu/~noor/Drug_Disease_ontology/DDID.owl#genotypePhenotype"/>
                                        <rs:variable>omimt</rs:variable>
                                      </rdf:Description>
                                      <rdf:Description rdf:nodeID="A6">
                                        <rs:value rdf:resource="https://csel.cs.colorado.edu/~noor/Drug_Disease_ontology/DDID.rdf#C0007589"/>
                                        <rs:variable>w</rs:variable>
                                      </rdf:Description>
                                      <rdf:Description rdf:nodeID="A7">
                                        <rs:binding rdf:nodeID="A8"/>
                                        <rs:binding rdf:nodeID="A9"/>
                                        <rs:binding rdf:nodeID="A10"/>
                                        <rs:binding rdf:nodeID="A11"/>
                                        <rs:binding rdf:nodeID="A12"/>
                                        <rs:binding rdf:nodeID="A13"/>
                                        <rs:binding rdf:nodeID="A14"/>
                                        <rs:binding rdf:nodeID="A15"/>
                                        <rs:binding rdf:nodeID="A16"/>
                                        <rs:binding rdf:nodeID="A17"/>
                                        <rs:binding rdf:nodeID="A18"/>
                                        <rs:binding rdf:nodeID="A19"/>
                                        <rs:binding rdf:nodeID="A20"/>
                                        <rs:binding rdf:nodeID="A21"/>
                                        <rs:binding rdf:nodeID="A22"/>
                                        <rs:binding rdf:nodeID="A23"/>
                                        <rs:binding rdf:nodeID="A24"/>
                                        <rs:binding rdf:nodeID="A25"/>
                                        <rs:binding rdf:nodeID="A26"/>
                                        <rs:binding rdf:nodeID="A27"/>
                                      </rdf:Description>

                                How can I remove these node things and
                                make it something like:

                                     <rdf:Description rdf:about="https://csel.cs.colorado.edu/~noor/Drug_Disease_ontology/DDID.rdf#C3229174">
                                        <j.0:label>Cytra-K Oral Product</j.0:label>
                                        <rdf:type rdf:resource="https://csel.cs.colorado.edu/~noor/Drug_Disease_ontology/DDID.owl#chemical"/>
                                      </rdf:Description>

                                please help me out





                        --
                        Adeeb Noor
                        Ph.D. Candidate
                        Dept of Computer Science
                        University of Colorado at Boulder
                        Cell: 571-484-3303 <tel:571-484-3303>
                        Email: adeeb.n...@colorado.edu
                        <mailto:adeeb.n...@colorado.edu>
















