Re: [MarkLogic Dev General] (no subject)

2018-01-25 Thread C. Yaswanth
Thanks Justin,

As you suggested, I went through the MarkLogic Data Movement SDK in the Java
API, but I found two ways of invoking a transform module:

   1) Install a transform on the server and then use it from the Data Movement
      SDK (i.e. http://docs.marklogic.com/guide/java/transforms).
   2) Install and invoke a module (i.e.
      https://docs.marklogic.com/guide/java/resourceservices#id_84134).

I want to know whether I can use the SDK only with the first approach, or
whether the second approach works with it as well.

Thanks
Yaswanth C
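
For reference, the idea behind the advice quoted below (many small
transactions instead of one large one) can be sketched in plain JavaScript,
independent of any MarkLogic API. This is only an illustration under stated
assumptions: `toEnvelope` and `toBatches` are hypothetical helper names, the
envelope fields are copied from the script quoted below, and the batch size is
arbitrary. In a real setup the Data Movement SDK does the batching and each
batch is committed in its own transaction.

```javascript
// Wrap one source document in the envelope used by the original script.
// (Hypothetical helper; field names are taken from the quoted JS.sjs.)
function toEnvelope(source) {
  return {
    Metadata: { 'Last Used': '' },
    Updated: { University: 'UCLA' },
    Source: source // original data kept under Source
  };
}

// Split a list of URIs into fixed-size batches. Each batch would be handed
// to its own transaction rather than one request touching every document.
function toBatches(uris, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < uris.length; i += batchSize) {
    batches.push(uris.slice(i, i + batchSize));
  }
  return batches;
}
```

Because the envelope logic is a pure function of one document, it works the
same way whether it is applied to one document or to a batch of a few hundred.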

On Thu, Jan 25, 2018 at 1:45 AM, Justin Makeig wrote:

> You’re trying to update all of the documents in the “input” collection in
> a single transaction, the default scope of a JavaScript module. For small
> numbers of documents (hundreds or thousands) that will work. For large or
> unknown numbers of documents that will generally overwhelm some resource.
> In your case, you’ve blown out your expanded tree cache. Your best bet is
> to break the transformation into multiple transactions and spread those out
> over multiple hosts in a cluster. The MarkLogic Data Movement SDK is
> designed for exactly such workloads. Take a look at the docs <
> https://docs.marklogic.com/guide/java/data-movement#id_51555> for more
> information on how to orchestrate a bulk transformation from Java.
>
> Justin
>
> --
> Justin Makeig
> Senior Director, Product Management
> MarkLogic
> jmak...@marklogic.com
>
> On Jan 24, 2018, at 10:19 AM, C. Yaswanth  wrote:
>
> Hi All,
>
> I have a set of JSON files (total: 1M files, 500 MB). Each JSON file has
> 18 keys. I tried to implement the envelope pattern using the JavaScript
> below:
>
> 'use strict';
> declareUpdate()
> var docs = fn.collection("input");
> for(var doc of docs) {
>   var transformed = {};
>
>   transformed.Metadata = { "Last Used" : "" };
>   transformed.Updated = { "University" : "UCLA" };
>   transformed.Source = doc; // Sending original data under Source section
>   xdmp.nodeReplace(doc, transformed);
> }
>
> I tried invoking this `JS.sjs` using the Java API of MarkLogic 9, but I
> encountered the error below:
>
> Exception in thread "main" com.marklogic.client.FailedRequestException:
> Local message: failed to apply resource at invoke: Internal Server Error.
> Server Message: XDMP-EXPNTREECACHEFULL: for(var doc of docs) { -- Expanded
> tree cache full on host localhost uri file.json-0-968991
>  at com.marklogic.client.impl.OkHttpServices.checkStatus(OkHttpServices.java:4317)
>  at com.marklogic.client.impl.OkHttpServices.postIteratedResourceImpl(OkHttpServices.java:3831)
>  at com.marklogic.client.impl.OkHttpServices.postEvalInvoke(OkHttpServices.java:3768)
>  at com.marklogic.client.impl.ServerEvaluationCallImpl.eval(ServerEvaluationCallImpl.java:164)
>  at com.marklogic.client.impl.ServerEvaluationCallImpl.eval(ServerEvaluationCallImpl.java:153)
>  at com.marklogic.client.impl.ServerEvaluationCallImpl.evalAs(ServerEvaluationCallImpl.java:144)
>  at bulkimport.Tsm.main(Tsm.java:19)
>
> I went through the documentation (i.e.
> https://help.marklogic.com/knowledgebase/article/View/9/16/resolving-xdmp-expntreecachefull-errors),
> which describes ways to resolve this error. Following it, I increased
> `expanded tree cache size` to `2048`, but I am still facing the same error.
>
> How can I optimize the above code (i.e. `JS.sjs`) to avoid this error?
>
> Any help is appreciated.
> ___
> General mailing list
> General@developer.marklogic.com
> Manage your subscription at:
> http://developer.marklogic.com/mailman/listinfo/general
>
>
>


[MarkLogic Dev General] Why Would FlexRep Pull be Dramatically Slower Than Push for Same Database and Server Pair?

2018-01-25 Thread Eliot Kimber
I have a pair of ML 9 servers. 

On the master server I have a domain with a target configured with a docs per 
batch of 100 for a database with about 380K docs coming in at about 3GB 
reported by the ML status page.

When I use FlexRep push to another server with an empty database the push 
takes about an hour to 2 hours depending on time of day (and thus overall 
network traffic).

When I use FlexRep pull to pull from master to the remote, it takes about 9 
hours.

What would account for this time difference? I'm guessing it's that the pull 
process doesn't use the docs/batch setting (which if I manually set it to 1 for 
a push also results in about 9 hours).
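
The guess above can be put in rough numbers. The sketch below only counts
batches for the figures given in this message (about 380K documents, 100
docs per batch versus 1); it assumes one round trip per batch, which is a
simplification and not a statement about how FlexRep pull actually works.
`batchCount` is a hypothetical helper name.

```javascript
// Number of batches needed to move docCount documents at a given batch size.
function batchCount(docCount, docsPerBatch) {
  if (!Number.isInteger(docsPerBatch) || docsPerBatch < 1) {
    throw new RangeError('docsPerBatch must be a positive integer');
  }
  return Math.ceil(docCount / docsPerBatch);
}

const docCount = 380000; // "about 380K docs" (approximate figure from above)
const batchesAt100 = batchCount(docCount, 100); // 3800
const batchesAt1 = batchCount(docCount, 1);     // 380000
const ratio = batchesAt1 / batchesAt100;        // 100
```

The batch count differs by a factor of 100, while the observed times differ
by roughly 4.5x to 9x, so per-batch overhead would be only part of the total
cost even if the guess is right.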

As it happens, I don't need to use pull as I can use push just as easily, but I 
was just curious about the time difference and whether it's an inherent aspect 
of FlexRep pull, indicates a bug, or could be some configuration error on my 
part (but I don't think so since the target configuration is the same in both 
cases--the only variable is pull vs push).

Cheers,

Eliot
--
Eliot Kimber
http://contrext.com
 





Re: [MarkLogic Dev General] question about transactions

2018-01-25 Thread Erik Zander
Thanks David and Geert,

I may have caused some confusion about why I wanted to do both the insert and
the read. Geert's question also prompted me to take another look at the
documentation, to great benefit.

My goal was to have a function that calls a SPARQL endpoint for a specific
question.
To ease the load on the endpoint, I wanted to cache the RDF in MarkLogic.

I ended up doing pretty much the same thing, but using an in-memory store.

The code ended up like this:

declare function wdCon:getKeyDataforQid($QId as xs:string, $lang as xs:string?) 
as node()*

{
(: Variables etc. :)
let $data := 
sem:query-results-serialize(sem:sparql($sparqlML))//spql:results/spql:result/spql:binding[@name
 = "label"]/spql:literal/text()
return
if ($data) then
$data
else
let $sparqlWD :=
"
PREFIX wd: 
PREFIX rdfs: 
describe  ?label WHERE {
   " || $QId || "  rdfs:label ?label
   FILTER (langMatches( lang(?label), '" || $qlang || "' ) )
}"
let $wDataRdf := wdCon:queryWithSparql($sparqlWD)
let $wDSemTrip := sem:rdf-parse($wDataRdf, ("rdfxml",
"graph=http://www.wikidata.org"))
let $_rdf_insert := 
sem:rdf-insert($wDSemTrip,(),$conf:defaultPermissions  )
let $semStore := sem:in-memory-store($wDSemTrip)


return 
sem:query-results-serialize(sem:sparql($sparqlML,(),(),$semStore) 
)//spql:results/spql:result/spql:binding[@name = "label"]/spql:literal/text()
};


Regards and thanks
Erik

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert Josten
Sent: 25 January 2018 06:50
To: MarkLogic Developer Discussion 
Subject: Re: [MarkLogic Dev General] question about transactions

The outer query runs in query mode, so runs against the timestamp of initial 
invocation, causing it to never see the result of sem:rdf-insert. You'd have to 
put the sem:sparql in an xdmp:eval with different-transaction as well.
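
The timestamp behaviour described above can be illustrated with a toy model
in plain JavaScript. It is not MarkLogic's implementation and uses no
MarkLogic API: `ToyStore`, `commit` and `readAt` are hypothetical names, and
the model assumes only that a query reads as of the timestamp at which it
started.

```javascript
// Toy multi-version store: every commit gets an increasing timestamp.
class ToyStore {
  constructor() {
    this.timestamp = 0;
    this.versions = []; // entries of the form { key, value, committedAt }
  }

  // Commit a write in its own transaction; returns the commit timestamp.
  commit(key, value) {
    this.timestamp += 1;
    this.versions.push({ key, value, committedAt: this.timestamp });
    return this.timestamp;
  }

  // Read as of a timestamp: only versions committed at or before it are visible.
  readAt(key, asOf) {
    let found;
    for (const v of this.versions) {
      if (v.key === key && v.committedAt <= asOf) found = v.value;
    }
    return found;
  }
}

const store = new ToyStore();
const queryTimestamp = store.timestamp;  // the outer query starts here
store.commit('label', 'cached triple');  // the insert commits separately, later
const seenByOuterQuery = store.readAt('label', queryTimestamp); // undefined
const seenByLaterRead = store.readAt('label', store.timestamp); // 'cached triple'
```

The outer read is pinned to the earlier timestamp, so it misses the insert;
a read that starts after the commit sees it, which matches the "works on the
second call" symptom.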

I also wonder though: what are you trying to do, why trying to squeeze insert 
and read in one request?

Cheers,
Geert

From: David Ennis
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, January 24, 2018 at 7:34 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] question about transactions

Please look up the options for xdmp:eval and note the following options 
explained there:
- transaction-mode
- isolation

Then change your eval to have the following options:
- transaction-mode=update-auto-commit
- isolation=different-transaction

Then move the sem:sparql statement below the eval in your main code.

What are you doing here?

You are telling the insert to run as a separate transaction and auto-commit. 
This makes the triples available immediately after the eval is done. Therefore, 
you should run the select in the main code and not the isolated transaction.

Careful with the use of different transactions via eval and invoke. The wrong 
combination can get you into a deadlock.

Regards,
David Ennis

--


From: Erik Zander
Reply-To: MarkLogic Developer Discussion
Date: Wednesday, January 24, 2018 at 5:35 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] question about transactions

Hi All,

I have a question about, I think, transactions.

I want to insert some RDF and then query the database, and I want to do this in 
a function, so that I can call the function and, depending on whether the data 
is already in MarkLogic, fetch the data as RDF and insert it.

My problem is that the following code only returns a result the second time I 
call it.
I'd be thankful for any pointers.
Regards
Erik
Code below
==

xquery version "1.0-ml" encoding "utf-8";


import module namespace sem = "http://marklogic.com/semantics"
  at "/MarkLogic/semantics.xqy";
declare namespace 
rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#";

let $wDataRdf:=
http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"