[MarkLogic Dev General] Working with graphml

2017-05-24 Thread Rajesh Kumar
Hi Team,

Is there any plugin available to import or export .graphml data in MarkLogic?

Thanks & Regards,
Rajesh
___
General mailing list
General@developer.marklogic.com
Manage your subscription at: 
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] General Digest, Vol 155, Issue 32

2017-05-24 Thread Shiv Shankar
Thanks Gary,

It looks like ML 8.5.5/9.x doesn't have cts:maximum-value or cts:minimum-value.
I tried cts:min and cts:max instead, but they throw an
XDMP-NONMIXEDCOMPLEXCONT error. Any advice?

Thanks
Shan.


Re: [MarkLogic Dev General] Creating private documents through ReST

2017-05-24 Thread Hans Hübner
On Wed, May 24, 2017 at 2:23 PM, Erik Hennum 
wrote:

> Hi, Hans:
>
> You can think of the rest-reader and rest-writer roles as system roles.
>
> If you don't assign those roles to any application user, the users won't
> be able to see each other's documents.
>
> Instead, you assign the rest-reader and the rest-writer execute privileges
> to users.
>

Thanks, Erik!  This helps a lot!

-Hans

-- 
LambdaWerk GmbH
Oranienburger Straße 87/89
10178 Berlin
Phone: +49 30 555 7335 0
Fax: +49 30 555 7335 99

HRB 169991 B Amtsgericht Charlottenburg
USt-ID: DE301399951
Geschäftsführer:  Hans Hübner

http://lambdawerk.com/


Re: [MarkLogic Dev General] Processing Large Number of Docs to Get Statistics

2017-05-24 Thread Eliot Kimber
I got what I needed by creating a simple Groovy script that uses the XCC 
library to submit queries. The script is below. My main discovery was that I 
need to create a new session for every iteration to avoid connection timeouts. 
With this I was able to process several hundred thousand docs and accumulate 
the results on my local machine. My command line is:

groovy -cp lib/xcc.jar GetArticleMetadataDetails.groovy

I chose groovy because it supports Java libraries directly and makes it easy to 
script things.

Groovy script:

#!/usr/bin/env groovy
/*
 * Use XCC jar to run enrichment jobs and collect the results.
 */
 
import com.marklogic.xcc.*;
import com.marklogic.xcc.types.*;
 
ContentSource source = ContentSourceFactory.newContentSource("myserver", 1984, 
"user", "pw");

RequestOptions options = new RequestOptions();
options.setRequestTimeLimit(3600)

moduleUrl = "rq-metadata-analysis.xqy"

println "Running module ${moduleUrl}..."
println new Date()
File outfile = new File("query-result.xml")

outfile.write "\n";

 
(36..56).each { index ->
    // Use a fresh session for each group to avoid connection timeouts.
    Session session = source.newSession();
    try {
        ModuleInvoke request = session.newModuleInvoke(moduleUrl)

        println "Group number: ${index}, ${new Date()}"
        request.setNewIntegerVariable("", "groupNum", index);
        request.setNewIntegerVariable("", "length", 1);

        request.setOptions(options);

        ResultSequence rs = session.submitRequest(request);

        // Only the first result item is read.
        ResultItem item = rs.next();
        InputStream is = item.asInputStream();

        is.eachLine { line ->
            outfile.append line
            outfile.append "\n"
        }
    } finally {
        // Close the session even if the request fails.
        session.close();
    }
}

outfile.append "";

println "Done."
//  End of script.
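
Processing one group per request, as the script above does, amounts to
splitting the corpus into fixed-size windows. As a rough illustration only,
here is a minimal Python sketch of computing 1-based inclusive (start, end)
windows of the kind a [$start to $end] predicate would take. The function name
windows and the sizes used are hypothetical; how groupNum and length map to
positions inside rq-metadata-analysis.xqy is not shown in this thread.

```python
def windows(total, size):
    """Yield 1-based inclusive (start, end) pairs covering 1..total."""
    if size < 1:
        raise ValueError("size must be at least 1")
    start = 1
    while start <= total:
        # The last window may be shorter than size.
        end = min(start + size - 1, total)
        yield start, end
        start = end + 1


# Hypothetical numbers: 45,000 docs in windows of 20,000.
for start, end in windows(total=45000, size=20000):
    print(start, end)
```

Each pair can then be passed to one request, so that no single request has to
hold more documents than the server's caches allow.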

--
Eliot Kimber
http://contrext.com
 



On 5/22/17, 10:43 PM, "general-boun...@developer.marklogic.com on behalf of 
Eliot Kimber"  wrote:

I haven’t yet seen anything in the docs that directly addresses what I’m 
trying to do, and I suspect I’m simply missing some ML basics or going about 
things the wrong way.

I have a corpus of several hundred thousand docs (but it could be millions, of 
course), where each doc averages 200K and several thousand elements.

I want to analyze the corpus to get details about the number of specific 
subelements within each document, e.g.:


for $article in cts:search(/Article, cts:directory-query("/Default/", 
"infinity"))[$start to $end]
 return 

I’m running this as a query from Oxygen (so I can capture the results 
locally so I can do other stuff with them).

On the server I’m using I blow the expanded tree cache if I try to request 
more than about 20,000 docs.

Is there a way to do this kind of processing over an arbitrarily large set 
*and* get the results back from a single query request?

I think the only solution is to write the results back to the database 
and then fetch them as the last step, but I was hoping there was something 
simpler.

Have I missed an obvious solution?

Thanks,

Eliot

--
Eliot Kimber
http://contrext.com
 





Re: [MarkLogic Dev General] Priorities for queries

2017-05-24 Thread Oleksii Segeda
Gary,

Please correct me if I’m wrong, but this will only parallelize queries without 
addressing priorities. This means if one of them creates a lot of disk IO, the 
second one hangs.

Best,
Oleksii



From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Gary Vidal
Sent: Wednesday, May 24, 2017 6:58 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Priorities for queries

Oleksii,

Why don't you just create two app servers: one for query traffic and one for admin?

Regards

Gary


Re: [MarkLogic Dev General] Schema validation fails for certain XMLs #CGO#

2017-05-24 Thread Jain, Abhishek
Hi Bharathi,

This is not a solution, but it may help you get to one.
I think a proper understanding of lax and strict validation will help here. 
For any new element (in your case it's ), validation will fail if it doesn't 
comply with the XSDs.

https://msdn.microsoft.com/en-us/library/dd297062.aspx


Thanks and Regards,
Abhishek Jain

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Karunanithi, 
Bharathi
Sent: Wednesday, May 24, 2017 4:58 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Schema validation fails for certain XMLs

Can I please get any help on the below issue?

From: 
general-boun...@developer.marklogic.com
 [mailto:general-boun...@developer.marklogic.com] On Behalf Of Karunanithi, 
Bharathi
Sent: Tuesday, May 23, 2017 5:20 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Schema validation fails for certain XMLs

Hi Team,

I am getting the following validation error in MarkLogic 8.0-6.4 hosted on an 
AWS EC2 instance. Certain XML files fail validation against the schema in this 
environment only.
The same XML passes validation in other ML environments and in other 
standalone validation tools using the schemas.

The schema files used are attached in zip format. The main schema file in the 
zip is pam.xsd.
The sample XML for which the error occurs and the error itself are also 
attached to this email.

Error: Please note that this error occurs only for XML with line break 
tags ( or ). XML files without any line breaks pass schema validation.
XDMP-VALIDATEUNEXPECTED: (err:XQDY0027) validate lax { $n } -- Invalid node: 
Found @clear but expected (@title? & @class? & @id?) at 
fn:doc("//swcnpny048/2017/Print/Lucky/20170306/XML/GW201703_11_11.xml")/pam:message/pam:article/pam-xhtml:body/pam-xhtml:h1/pam-xhtml:br/@clear
 using schema "pam.xsd"

Script:
import module namespace validate = 
"http://condenast.com/dam/2.0/lib/validation" at 
"/application/lib/validation-lib.xqy";

import module namespace mapping = "http://condenast.com/dam/2.0/lib/mapping" at 
"/application/lib/mapping-lib.xqy";
import module namespace util = 
"http://condenast.com/dam/2.0/lib/validation-utils" at 
"/application/lib/validation-utils.xqy";

declare namespace cndam = "http://condenast.com/dam/2.0";

let $path:= "//swcnpny048/2017/Print/Lucky/20170306/XML/GW201703_11_11.xml"
let $assetType:= mapping:get-asset-type-by-extension($path)
let $content := xdmp:document-get($path)
let $schema := validate:get-schema-from-asset-type($assetType)
let $results := util:validate-with-schema(
 $content,
 $schema/cndam:namespace/text(),
 $schema/cndam:schemaLocation/text()
)

return $results

Please tell me if I am missing any configuration, or anything else.

Thanks,
Bharathi K




[MarkLogic Dev General] Priorities for queries

2017-05-24 Thread Gary Vidal
Oleksii,

Why don't you just create two app servers: one for query traffic and one for admin?

Regards

Gary


[MarkLogic Dev General] Ignoring empty/null values while search

2017-05-24 Thread Gary Vidal
Shiv,

The problem is quite simple to solve.  The first thing is to determine what
counts as empty: the element may not exist at all, or it may exist but be
empty.  Simply add a negated query,
 cts.notQuery(cts.propertyValueQuery(...)).  This will filter out documents
with an empty dob during projection of results.

Q2.

What you are expecting cannot be done simply, because age is a temporal
value and MarkLogic operates on stored values.  So you cannot simply
determine age without computing it for each document.  But let's assume
that dob is just an xs:date value; then extracting each bucket is quite
simple:

xquery version "1.0-ml";
(: Get the current date :)
let $today := fn:current-date()
(: Get the current range of dob values to derive the years :)
let $dob-ranges := cts:value-ranges(
   cts:json-property-reference("dob"),
   ()
)
(: Get the range of years to calculate.
   Any arithmetic between two dates results in a duration.
:)
let $years := fn:years-from-duration($today - $dob-ranges/cts:minimum-value)
              to fn:years-from-duration($today - $dob-ranges/cts:maximum-value)
let $year-dates := $years ! ($today - xs:yearMonthDuration(fn:concat("P",.,"Y")))
let $year-ranges := cts:value-ranges(
   cts:json-property-reference("dob"),
   $year-dates
)
let $year-map := json:object-define($years ! fn:string(.))
return (
   (: Iterate over all year-ranges and calculate the age :)
   for $yr in $year-ranges
   let $year := fn:years-from-duration($today - $yr/cts:maximum-value)
   return
      map:put($year-map,$year,cts:frequency($yr)),
   $year-map
)

(:Enjoy:)
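
To illustrate why age has to be computed against today's date rather than
read from a stored value, here is a minimal client-side sketch in Python. It
does not use any MarkLogic API; the function names age_in_years and
age_buckets and the sample dates are hypothetical, and None stands in for a
document with a missing or empty dob.

```python
from collections import Counter
from datetime import date


def age_in_years(dob: date, today: date) -> int:
    """Whole years elapsed between dob and today."""
    years = today.year - dob.year
    # Subtract one if this year's birthday has not happened yet.
    if (today.month, today.day) < (dob.month, dob.day):
        years -= 1
    return years


def age_buckets(dobs, today):
    """Count records per age, skipping missing (None) dob values."""
    return Counter(age_in_years(d, today) for d in dobs if d is not None)


# Hypothetical sample data; None stands for a document with no dob.
sample = [date(1980, 6, 1), date(1980, 12, 31), None, date(2000, 5, 24)]
buckets = age_buckets(sample, today=date(2017, 5, 24))
print(dict(buckets))
```

The same dob lands in a different bucket depending on the day the query runs,
which is why the bucket boundaries have to be recomputed from the current
date each time.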


[MarkLogic Dev General] concurrent invocation of xquery ending up with duplicate writes

2017-05-24 Thread Gary Vidal
Raghu,

I am sure there is more to the issue than you are stating. If every request
attempts to read and possibly write, then you are effectively in write mode
all the time, which will not scale.

The best way to solve the problem is to take the lock as the writer user, and
only when the document does not exist.  This allows all read threads to be
held if a write has to occur. This can be a challenge if you have a lot of
insert logic, but assuming it is as simple as xdmp:document-insert, you can
try the following:

let $doc := $some-doc-logic
let $exists := $my-logic-to-check-existence
let $constraint-key := local:some-func-to-create-key($doc)
return
   if($exists and $constraint-key) then $doc
   else
  xdmp:spawn-function(function() {
   let $mutex :=  $some-mutex-key-strategy
   return (
  xdmp:lock-for-update($mutex),
  xdmp:document-insert($doc-uri,$doc),
  $doc
   )
  },
  
  update-auto-commit
  true
  write-user
  )
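
The general idea of serializing only the writers can be sketched outside
MarkLogic as well. The following Python example is an in-process analogy, not
MarkLogic code: a dict stands in for the database and a threading.Lock stands
in for the mutex URI. The names store, store_lock and get_or_insert are
hypothetical, and the re-check after acquiring the lock is an addition in this
sketch rather than something shown in the XQuery above.

```python
import threading

store = {}                      # stands in for the database
store_lock = threading.Lock()   # stands in for the lock on the mutex key
insert_count = 0


def get_or_insert(key, make_doc):
    """Return the document for key, inserting it at most once."""
    global insert_count
    doc = store.get(key)        # read path, no lock taken
    if doc is not None:
        return doc
    with store_lock:            # writers serialize here
        # Re-check: another thread may have inserted while we waited.
        if key not in store:
            store[key] = make_doc()
            insert_count += 1
        return store[key]


threads = [
    threading.Thread(target=get_or_insert, args=("/doc/1", lambda: {"id": 1}))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(insert_count)
```

Readers that find the document never touch the lock, so only the rare insert
path pays for serialization.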

HTH,

Gary Vidal