[MarkLogic Dev General] Priorities for queries
Oleksii, why would you want to prioritize queries the way you described? It does not make sense to deprioritize disk I/O unless you have a disk performance problem. Consider disk I/O from stand merges a natural part of doing business in MarkLogic, as in any system that does "log level compaction". If you are creating documents in bulk while running queries, there are a few techniques you could employ, such as putting the "fast data directory" on an SSD, or using the dd command to figure out why your disks are slow. Again, without knowing your read/write patterns and the cardinality and shape of your data, this is a very hard question to answer correctly. You may want to look at pausing stand merges with merge blackouts during periods of high query load, but this should be done with extreme caution and with your query patterns in mind. Happy to discuss directly with you; feel free to email me. Regards, Gary Vidal ___ General mailing list General@developer.marklogic.com Manage your subscription at: http://developer.marklogic.com/mailman/listinfo/general
[MarkLogic Dev General] Processing Large Number of Docs to Get Statistics
Eliot, I will share some code I wrote using Apache Flink that does exactly what you want for MarkLogic, running on a client machine. The problem is that with such an old version of ML you are forced to pull every document out and perform the analysis externally. In a previous life I wrote a version that runs inside MarkLogic using spawn and parallel tasks. I am not sure it would work on 4.2, but I will share it for the sake of others: https://github.com/garyvidal/ml-libraries/tree/master/task-spawner Feel free to contact me directly for any additional help.
[MarkLogic Dev General] (no subject)
Shiv, My apologies, I was pulling the information from memory. The correct element names are cts:minimum and cts:maximum. See the examples at https://docs.marklogic.com/cts:value-ranges for the output. Regards, Gary Vidal
[MarkLogic Dev General] Processing Large Number of Docs (Eliot Kimber)
Eliot, There is a way to make XCC stream results out without any timeout. First, write your XQuery in the let-free style described here: https://blakeley.com/blogofile/2012/03/19/let-free-style-and-streaming/ https://gist.github.com/mblakele/2127371 Then, in your XCC code, turn off result caching, which is a performance bottleneck and prevents streaming: https://docs.marklogic.com/guide/xcc/concepts#id_67763 HTH, Gary Vidal
[MarkLogic Dev General] Priorities for queries
Oleksii, Why don't you just create two app servers, one for query traffic and one for admin? Regards, Gary
[MarkLogic Dev General] Ignoring empty/null values while search
Shiv, The problem is quite simple to solve. The first thing is to determine what "empty" means: usually it is either the absence of the element or an element with no value. Simply add a negated query, cts.notQuery(cts.propertyValueQuery(...)), to your search. This filters out documents with an empty dob when the results are projected.
Q2. What you are expecting cannot be done directly, because age is a temporal value and MarkLogic operates on stored values, so you cannot determine age without computing it for each document. But assuming dob is just an xs:date value, the simplest way to extract each bucket is:

xquery version "1.0-ml";
(: Get the current date :)
let $today := fn:current-date()
(: Get the overall range of dob values :)
let $dob-ranges := cts:value-ranges(cts:json-property-reference("dob"), ())
(: Get the range of years to calculate. Any arithmetic between two dates results in a duration. :)
let $years :=
  fn:years-from-duration($today - $dob-ranges/cts:minimum) to
  fn:years-from-duration($today - $dob-ranges/cts:maximum)
let $year-dates := $years ! ($today - xs:yearMonthDuration(fn:concat("P", ., "Y")))
let $year-ranges := cts:value-ranges(cts:json-property-reference("dob"), $year-dates)
let $year-map := json:object-define($years ! fn:string(.))
return (
  (: Iterate over all year ranges and calculate the age for each :)
  for $yr in $year-ranges
  let $year := fn:years-from-duration($today - $yr/cts:maximum)
  return map:put($year-map, fn:string($year), cts:frequency($yr)),
  $year-map
)
(:Enjoy:)
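For comparison, the per-document arithmetic that the range-index approach avoids can be sketched in plain JavaScript. This is only an in-memory illustration, not MarkLogic code: the `records` array, the field names and the helper names `ageInYears` and `ageFrequencies` are all assumptions made up for the example. It skips missing or empty dob values and counts documents per whole year of age.

```javascript
// Whole years between a date of birth and a reference date (UTC calendar fields).
function ageInYears(dob, today) {
  let age = today.getUTCFullYear() - dob.getUTCFullYear();
  const beforeBirthday =
    today.getUTCMonth() < dob.getUTCMonth() ||
    (today.getUTCMonth() === dob.getUTCMonth() &&
      today.getUTCDate() < dob.getUTCDate());
  if (beforeBirthday) age -= 1; // birthday not reached yet this year
  return age;
}

// Count records per age, skipping records whose dob is missing or empty.
function ageFrequencies(records, today) {
  const counts = {};
  for (const rec of records) {
    if (rec.dob === undefined || rec.dob === null || rec.dob === "") continue;
    const age = ageInYears(new Date(rec.dob), today);
    counts[age] = (counts[age] || 0) + 1;
  }
  return counts;
}

// Hypothetical sample data: two records have no usable dob.
const sample = [
  { name: "a", dob: "1990-06-15" },
  { name: "b", dob: "1990-01-02" },
  { name: "c", dob: "" },
  { name: "d" },
  { name: "e", dob: "2000-12-31" },
];
const ageCounts = ageFrequencies(sample, new Date("2020-06-01"));
```

The point of the index-based version is that this loop never has to touch each document; the sketch only shows what "age per document" means.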
[MarkLogic Dev General] concurrent invocation of xquery ending up with duplicate writes
Raghu, I am sure there is more to the issue than you are stating. If every request reads and then possibly writes, you are effectively in update mode all the time, which will not scale. The best way to solve the problem is to have the writer user lock the document: if the document does not exist, acquire the lock as the writer. This holds all read threads while a write has to occur. It can be a challenge if you have a lot of insert logic, but assuming it is as simple as xdmp:document-insert, you can try the following:

let $doc := $some-doc-logic
let $exists := $my-logic-to-check-existence
let $constraint-key := local:some-func-to-create-key($doc)
return
  if ($exists and $constraint-key) then $doc
  else
    xdmp:spawn-function(function() {
      let $mutex := $some-mutex-key-strategy
      return (
        xdmp:lock-for-update($mutex),
        xdmp:document-insert($doc-uri, $doc),
        $doc
      )
    },
    <options xmlns="xdmp:eval">
      <transaction-mode>update-auto-commit</transaction-mode>
      <result>true</result>
      <user-id>{xdmp:user("write-user")}</user-id>
    </options>)

HTH, Gary Vidal
Re: [MarkLogic Dev General] json:config for XML schema (David Lee) (David Lee)
>>> [DAL] I am not saying they are 'deficient'. I am saying that because they require an *already created* document to query, they are not generally useful as a means of determining how to construct that document in the first place.
[GV] Not necessarily; transformation is a bit different from transliteration. I think he was suggesting that XML->JSON is a transliteration, as opposed to trying to match a different schema. In terms of already created documents, as a "framework" engineer, not a developer, I make no assumptions about what data is represented by a schema or an XML document. The presumption is that if you want a framework-based approach rather than a hand-coded integration, you live with trade-offs. Using sc:* allows you to reflect on the type of document you have been passed, and then lets users make general configuration changes to augment the library using higher-order functions, dependency injection, or pointcuts for refinement. The biggest problem with XQuery development is that people embed their transformations and understanding into their code, so every time the data model changes their code has to change to reflect it. In my opinion this is worse than some rigidity. The schema is the construction template and the validation that proves it. As noted, I have code that does partial updates (PATCH) against a set of zones within the XML, which must be defined according to multiple schema types and versions. The generic pattern allows me to create configuration-driven update processing without committing to a bunch of hardcoded dependencies like recursive typeswitches or nested if statements. I have had to think very hard about how to achieve this given all the permutations of XML Schema and its target representations. Again, there are cases I have not touched, but a framework can abstract that out so that a gap becomes a bug to fix rather than a reason to quit because you don't feel you are getting exactly what you want.
>>> [DAL] I'll up the ante one further.
If you already have the source and target document 'in mind' you don't even need the sc:* functions.
[GV] Because, as with reflection, you may know what is in the document, but do you know what should be in the document? The schema tells you this; your developer will not. Consider any framework that had to be recoded for every client because the XML and schema were different. Writing "schema-aware" code is harder than writing a bunch of hand-coded XQuery modules, and I still have to explain to PMs that when I make a change it is system-wide and affects more than their seemingly tiny ask.
>>> [DAL] problem only arises in the first place when the target system cannot directly represent 64 bit integers as 'Numbers'. That is only problematic if you need to do numeric operations on them. Otherwise it's much easier to 'pass along' a string value, display it, even do inequality operations, than a structured value. It is also more compact, more readable and more efficient.
[GV] So everyone is happy to use a deficient transport format that cannot handle anything aside from String, Number and Integer. Being a purist is nonsensical if you consider JSON/JavaScript a real standard/language. Again, this is the readability vs. correctness debate. But from your XML schema you can generate a JSON Schema for your client libraries, which will then interpret values according to the schema, rather than relying on some developer reading the JSON without any knowledge of its nuances. Again, I have built this feature and am happy to share it once I can recreate it from memory.
>>> [DAL] To date, I have not yet run into a customer case that had an XML schema for a JSON document they wanted to transform. So my 'gut feeling' would be it's more useful to write the JSON schema directly.
[GV] The truth is most customers using MarkLogic have legacy XML that they are trying to modernize for client consumption. I have rarely had the opportunity to build a data application from scratch using MarkLogic with pure JSON.
For that work I just go with JSON Schema and JavaScript directly. Yet I have many clients who are stuck with a ton of legacy XML content but have downstream clients who require JSON and other variant formats (AVRO, semantic triples, etc.). They don't have the luxury of maintaining multiple copies and disjoint schema definitions to satisfy everyone's requirements. By building these abstractions out (even my own Schema DSL in XQuerrail) I can know, with some degree of confidence and fact, that my code works for all the cases I have encountered. And when it doesn't or can't, I can use dependency injection or write custom code to counter that.
>>> [DAL] Not true, my answers are my own.
[GV] Sure, sorry, I was being snarky, and I welcome a healthy debate. But my experiences are different, and we should assume we are all trying to educate. Sometimes I operate on the fringes of the platform that are not understood by or apparent to most people. My thoughts are very abstract because I refuse to write code that is brittle and dependent on any data model or
Re: [MarkLogic Dev General] json:config for XML schema (David Lee)
In the example it simply enumerates your schema based on the sc:type of a given document element and creates a search:options node you can pass into search:search. Here are some things I will share soon:
- XML Schema to JSON document conversion.
- XML Schema to JSON Schema generation.
- XML Schema to partial update with customized ordering semantics. Generates a builder function map that allows dependency injection of recursive typeswitch.
- Any other reasonable requests.
Regards, Gary Vidal
Attachment: sc-component-extensions.xqy
Re: [MarkLogic Dev General] json:config for XML schema
Well, the good news is that if you have a schema, you already know the definition of the structure you need to convert. The general issues are dealing with "mixed" content, linking @ref elements to their ultimate definitions, and handling things like xs:sequence vs. xs:choice. MarkLogic has a library that can execute against the schema and give you the means to write your own custom code to convert to JSON, etc. If you look at the sc:* functions, you can use them to walk the schema, and then, with a few functions to build out the structure you need, create a function that does the transformation for you. I have various bits of code I can share if you need help. If you give me some time (say, tomorrow) I can probably write the code to generate the JSON for you. Ping me directly if you need any help. Regards, Gary Vidal
Re: [MarkLogic Dev General] question about permissions
You have no document permissions. When you inserted the content you probably used the admin role, which means the documents in the database were created without permissions. You can confirm this by running xdmp:document-get-permissions($uri) on any document. To solve this problem, always load your content with explicit permissions. If you don't have the luxury of reloading, you can spawn functions on the task server to add them. Make sure the task server has enough queue depth.

let $perms := (
  xdmp:permission("role","read"),
  ...
)
return
  for $uri in cts:uris()
  return xdmp:spawn-function(function() {
    xdmp:document-set-permissions($uri, $perms)
  },
  <options xmlns="xdmp:eval">
    <transaction-mode>update-auto-commit</transaction-mode>
  </options>)
[MarkLogic Dev General] marklogic xquery to extract users and roles (Eric Shevchuk)
Here is a basic script you can call from any database that has a security database. The trick is to use xdmp:invoke-function to go from your database to the security database. Once you're in the security database, just read it as regular XML. Note that you should never expose the security IDs. A caveat is that this does not handle role recursion, so you will have to solve that issue yourself. It is something I did in a few minutes, so I leave that to you as an exercise :-D

import module namespace sec = "http://marklogic.com/xdmp/security" at "/MarkLogic/security.xqy";

xdmp:invoke-function(function() {
  let $users := /sec:user
  let $roles := map:new(/sec:role ! map:entry(fn:string(./sec:role-id), fn:string(./sec:role-name)))
  return
    for $user in $users
    return
      (: wrapper element names below are placeholders :)
      <user>{
        $user/sec:user-name ! <name>{fn:data(.)}</name>,
        <roles>{
          for $role in $user/sec:role-ids/sec:role-id
          return <role>{map:get($roles, fn:string($role))}</role>
        }</roles>
      }</user>
  },
  <options xmlns="xdmp:eval">
    <database>{xdmp:security-database()}</database>
  </options>
)
[MarkLogic Dev General] Better Javascript Binary Support?
I have recently been working on a geospatial project to extract SHP and DBF files from zip files. For those interested, the code can be found here: https://github.com/garyvidal/ml-libraries/tree/master/shpParser
When working with binary data from JavaScript, it appears that MarkLogic JavaScript has limited features for it. Ideally a binary node should have some mechanism to convert to a Uint8Array or to a native ArrayBuffer. After experimenting a bit I was able to write some standard functions to do this, but marshalling between xdmp.* functions seems like a performance issue. I would prefer a native solution, as in Chrome extensions. Here are a few functions I think would be helpful:
- atob: ASCII (base64) to binary (standard conversion)
- btoa: binary to ASCII (base64)
- bin.toArrayBuffer(binary): returns the Uint8Array/buffer
- bin.toString(encoding): should return the string from the binary based on the encoding
- TextEncoder/TextDecoder classes
- Blob support?
Any thoughts on this matter? Here are a few examples of things I did to support binaries in JavaScript (highly experimental):

function BinToBuffer(bin) {
  var MUL = 2; // two hex characters per byte
  var buff = xs.hexBinary(bin);
  var vals = buff.toString();
  var byteLength = vals.length / MUL;
  var buffer8 = new Uint8Array(byteLength);
  for (var byte = 0; byte < byteLength; byte++) {
    buffer8[byte] = xdmp.hexToInteger(vals.substr(byte * MUL, MUL));
  }
  return buffer8;
}

function Utf8ArrayToStr(array) {
  var out, i, len, c;
  var char2, char3;
  out = "";
  len = array.length;
  i = 0;
  while (i < len) {
    c = array[i++];
    switch (c >> 8) //was 4
    {
      case 0: case 1: case 2: case 3: case 4: case 5: case 6: case 7:
        // 0xxx
        out += String.fromCharCode(c);
        break;
      case 12: case 13:
        // 110x 10xx
        char2 = array[i++];
        out += String.fromCharCode(((c & 0x1F) << 6) | (char2 & 0x3F));
        break;
      case 14:
        // 1110 10xx 10xx
        char2 = array[i++];
        char3 = array[i++];
        out += String.fromCharCode(((c & 0x0F) << 12) | ((char2 & 0x3F) << 6) | ((char3 & 0x3F) << 0));
        break;
    }
  }
  return out;
}
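The two conversions discussed above can also be sketched in plain JavaScript with no xdmp.* or xs.* calls, which makes them easy to try outside MarkLogic. This is a minimal sketch, not the shpParser code: the names `hexToBytes` and `utf8BytesToString` are made up for the example, the input is assumed to be a hex string such as the one xs.hexBinary would print, and the decoder switches on the high nibble (`c >> 4`) and handles only 1- to 3-byte UTF-8 sequences.

```javascript
// Convert a hex string (two characters per byte) into a Uint8Array.
function hexToBytes(hex) {
  if (hex.length % 2 !== 0) {
    throw new Error("hex string must have an even length");
  }
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(i * 2, i * 2 + 2), 16);
  }
  return bytes;
}

// Decode UTF-8 bytes to a string. Only 1-, 2- and 3-byte sequences are
// handled; other lead bytes are skipped.
function utf8BytesToString(bytes) {
  let out = "";
  let i = 0;
  while (i < bytes.length) {
    const c = bytes[i++];
    const hi = c >> 4; // high nibble selects the sequence length
    if (hi <= 7) {
      out += String.fromCharCode(c); // 0xxxxxxx
    } else if (hi === 12 || hi === 13) {
      const c2 = bytes[i++]; // 110xxxxx 10xxxxxx
      out += String.fromCharCode(((c & 0x1f) << 6) | (c2 & 0x3f));
    } else if (hi === 14) {
      const c2 = bytes[i++]; // 1110xxxx 10xxxxxx 10xxxxxx
      const c3 = bytes[i++];
      out += String.fromCharCode(
        ((c & 0x0f) << 12) | ((c2 & 0x3f) << 6) | (c3 & 0x3f)
      );
    }
  }
  return out;
}

// "H", "é" (C3 A9) and "€" (E2 82 AC) encoded as UTF-8, written as hex.
const decodedBytes = hexToBytes("48c3a9e282ac");
const decodedText = utf8BytesToString(decodedBytes);
```

Where a runtime offers TextDecoder, that would normally be preferable to a hand-written decoder; the sketch only shows the mechanics.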
[MarkLogic Dev General] Data Modelling for "option lists" (Anne Taylor)
value FILTER(?value = $regionQuery) ?region skos:narrower* ?locations } ", map:entry("regionQuery","North America")) ! map:get(.,map:keys("location"))
return cts:search(/document,cts:triple-range-query((),(),$locations))

The $locations SPARQL resolves all skos:narrower relationships transitively, so it picks up any relationship to the countries in the region (US, CA, MX); and if those countries are associated with states and cities that have broader|narrower relationships to them, those conditions are added to the query as well. To summarize, the general goal is to identify the list of possible values, return the documents that represent those values, and decouple the concerns of the list so you can enhance your queries without modelling all these relationships in your content. Using a multi-modal document and query structure allows a lot of flexibility in building rich applications on MarkLogic. TL;DR: it's okay. You're on the right track, and I hope this helps. If you are interested in more thoughts on this subject, please feel free to contact me directly. Gary Vidal
[MarkLogic Dev General] Anyoine seen this error before? : XDMP:FORESTERR
Not sure this helps, but you can theoretically get your content back from your forest by using the MLCP extract method. This reads the forest directly, and you may have enough metadata to do this. https://docs.marklogic.com/guide/mlcp/extract#id_25802
[MarkLogic Dev General] How to structure big XML files for fastest access?
John, Your biggest problem is going to be projecting your data out of your data source efficiently. The best way is to return exactly what is requested with no transformation, or to structure the data so that documents are partitioned along a logical query pattern (by customer or by year, for example) to reduce disk I/O on fetch. The first thing to address is your structure. Ideally, canonical XML will be better than the property/@attribute/value type of structure. Avoid attributes in most cases, because they rely on specific accessors that are slower than element accessors. In addition, I find that attributes can cause bleed conditions or false positives without positional indexes, because the query may consider the attribute true for one element and true for the textual value of another element. Also, you cannot do text searches on attributes without knowing the element the attribute is anchored to, so this could prevent full-text searches like cts:word-query. Here are some short recommendations.
BAD XML in MarkLogic:
Some Value
Be careful of this pattern (concrete outer element / generic inner element). While you can query it correctly, it requires complex wrapping rules and may require positional indexes.
cts:element-query(xs:QName("company"),cts:element-value-query(xs:QName("name"),"someValue"))
Name CA Author Name MD
Prefer:
If you have N-occurring patterns, then you may need positional indexes to ensure that the state one position from a name is the state for that given name.
The BEST representation is flattened as much as possible. Make every element representative without any flags or complex attribute queries. Avoid too many nested conditions such as /a/b/c/a vs. a/b/a. MarkLogic works on pairs of elements when it builds the query plan. So by reducing the path steps, you save yourself from complex queries and from false positives when the path conditions are too deep for MarkLogic to ensure proper filtering.
SomeValue
Projecting out N columns is also harder if the properties are dynamic. At all costs avoid the following patterns:
- Nested for loops (outer over rows, inner over columns). This increases your query cost to x * x, since XQuery will not optimize the inner loop.
  for $r in $rows[1 to ...] return for $col in $r/col[@attribute = "someAttribute"]
- //*[@attribute = ], as this is in effect a table scan for every property of every row.
Now I am going to assume XQuery for your use case and a possible solution. The best XQuery rules I can give you are:
- Always use the most absolute path to access an element/attribute.
- If iterating over many columns, make each column call inline, as shown below.
- Definitely avoid iterating with xdmp:eval or xdmp:value calls.
- Avoid an inner FLWOR for selecting properties over rows.
for $row in $rows return element outer-element {$row/(col1|col2|col3|col10)}
for $row in $rows return element outer-element {$row/col1,$row/col2,$row/col3,$row/col10}
Both statements are optimized by the XQuery engine, but they leave you writing very concrete query patterns for each request. The best way I have found to optimize dynamic column selection is to build a function by concatenating your column selection into a string and calling xdmp:value to return a materialized function. Once the function is created, it will optimize itself after repeated calls. A trivial example would be:

let $results := cts:search(fn:doc(),$some-query-that-returns-results)
let $colpaths := ("/foo/bar","/foo/bat","/foo/baz")
(: Create the dynamic function. Consider what transform you want out and add it to the function creation, such as JSON or projection as canonical XML. :)
let $func := xdmp:value(fn:concat("function ($result) { element outer-element { $result/(", fn:string-join($colpaths,"|"), ") }}"))
return $results ! $func(.)
If you find yourself reusing the functions across multiple calls, consider using xdmp:get|set-server-field, which stores the function in an app-server field so there is no need to rebuild it across requests. The best way to determine how your query behaves is to use the profile tab in qconsole and focus on the top 5-10 calls. Also look for call frequencies that are exponentially larger than the number of documents you are returning. Hope this helps. Regards, Gary Vidal
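The build-once, reuse-many idea behind the dynamic function and the server field can be illustrated outside MarkLogic with a small plain-JavaScript sketch. Everything here is hypothetical: `projectorCache` stands in for the server field, `getProjector` for the function builder, and rows are plain objects rather than XML nodes.

```javascript
// Stand-in for an app-server field: built projectors keyed by column list.
const projectorCache = new Map();

// Return a function that keeps only the requested columns of a row.
// The function is built once per distinct column list and then reused.
function getProjector(columns) {
  const key = columns.join("|");
  let project = projectorCache.get(key);
  if (!project) {
    project = (row) => {
      const out = {};
      for (const col of columns) {
        if (col in row) out[col] = row[col];
      }
      return out;
    };
    projectorCache.set(key, project);
  }
  return project;
}

// Hypothetical rows; in the post these would be search results.
const rows = [
  { col1: "a", col2: "b", col3: "c" },
  { col1: "d", col3: "f" },
];
const projected = rows.map(getProjector(["col1", "col3"]));
```

A second request with the same column list gets the cached function back instead of rebuilding it, which is the effect the server-field suggestion is after.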
[MarkLogic Dev General] fn:doc() calls cached?
Navdeep, In addition to what Justin mentioned, and for the TL;DR: you can determine what is being cached vs. loaded by calling xdmp:query-meters() at the end of your request. Typically every fragment that is part of your evaluation will be listed there with either a hit or a miss in one of the caches. Also, some evaluation is lazy, so use xdmp:eager(...), xdmp:query-meters() to ensure accurate results. For more information see https://docs.marklogic.com/xdmp:query-meters
[MarkLogic Dev General] Parsing json version of search:search query in json using
I would like to file an RFE for this feature. Here is the challenge I am currently facing. I have loaded a search configuration called "all". Currently there are no out-of-the-box APIs to access that configuration outside of the built-in APIs. I have several custom APIs/transforms that enhance or interact with the query component, so I need to be able to parse both the qtext form and the internal form of the query components in JSON format. The code Geert shared does most of the plumbing and hooks into the internal APIs, so my recommendation is to expose APIs that allow the following:
1. Get search options by name through a REST utility, so that I can take the name from a parameter and then resolve the textual portion of the query.
2. Parse the JSON representation of the query to a cts.query without requiring access to the original options node.
Thanks, Geert, for the snippet to get this resolved. Regards, Gary Vidal
Re: [MarkLogic Dev General] Parsing json version of search:search query in json using custom transform in sjs
Thanks Erik, I think this would be a good RFE to file. On another note, is it possible to get the search:options name passed via the REST request and then retrieve the options during the transform phase? Regards, Gary Vidal
[MarkLogic Dev General] Parsing json version of search:search query in json using custom transform in sjs
All, I am trying to figure out how to parse the query (in JSON format) so I can get the equivalent cts.query. Is there an API call that will resolve a JSON query back to cts? Regards, Gary
Re: [MarkLogic Dev General] MarkLogic 8.0-5 Indenting JSON output
Thanks Justin, I figured that one out about JSON.stringify. Ideally, it would be best to eval to a function that can be reused. Of course safety and performance would be an issue. I am not sure whether the V8 connection is kept open during XQuery evaluation. Anyway, it's for a small documentation app that just pretty-prints JSON examples for viewing on a Markdown page. I can deal with indented JSON in a large file of examples. :-)
[MarkLogic Dev General] MarkLogic 8.0-5 Indenting JSON output
All, Is there an equivalent way to format JSON output, like declare option xdmp:output "indent=yes"; does for XML?
[MarkLogic Dev General] Sharing Modules
You can share modules across applications by having all app servers use the same modules database. Consider the following folder structure:
/global-modules/
/app-1
/app-2
/app-3
If you set the modules root of your app servers to "/", all apps can share the global modules as well as any other app's code. Maintenance may then be the issue, as well as ensuring you don't overwrite the global modules. So grant read/execute on all modules, but grant each application's insert/update role only on that application's directory.
[MarkLogic Dev General] fn:current-dateTime() (sweet frd)
fn:current-dateTime() will never change within the same session/transaction.
Re: [MarkLogic Dev General] General Digest, Vol 143, Issue 21
You can bind to an external variable that returns the subject outside the binding, like so:

'use strict';
//xdmp.documentInsert("/foo.xml",{"triple" : sem.triple(sem.iri('foo'),sem.iri('hasBar'),sem.iri('bar'))});
var sem = require('/MarkLogic/semantics');
var s0 = { 's0': [sem.iri('foo')] };
sem.sparql('SELECT * WHERE { ?s ?p ?o FILTER(?s = ?s0)}', s0)

> Today's Topics:
> 1. sem.sparql: get bound placeholders in the result (Florent Georges)
> Message: 1
> Date: Sun, 15 May 2016 13:41:47 +0200
> From: Florent Georges
> Subject: [MarkLogic Dev General] sem.sparql: get bound placeholders in the result
Re: [MarkLogic Dev General] Data profiling on large datasets
Alex, I hoisted this code from a project I wrote that does analysis and captures statistics. The goal is to do a recursive descent until all root nodes are resolved, by appending a query that negates the root nodes already visited, with a cut-off before the tree cache fills up.

declare namespace a = "a:roots";
declare variable $bcount := 0;
declare variable $MAX-ITERATIONS := 200;
declare variable $base-constraint := cts:and-query(());

declare function a:get-root-elements($bdone,$qnames,$results) {
  xdmp:set($bcount,$bcount + 1),
  if($bdone or $bcount > $MAX-ITERATIONS) then (
    xdmp:log(fn:concat("Starting Root Frequency:",xdmp:elapsed-time()),"debug"),
    for $k in $results
    let $parts := fn:analyze-string($k,"\{(.*)\}(.*)")
    let $ns := fn:string($parts/*:match/*:group[@nr eq 1])
    let $local-name := fn:string($parts/*:match/*:group[@nr eq 2])
    let $frequency :=
      if($ns eq "") then
        xdmp:eval(
          fn:concat("declare variable $base-constraint external; xdmp:estimate(cts:search(/",$local-name,",($base-constraint),('unfiltered')))"),
          (fn:QName("","base-constraint"),$base-constraint)
        )
      else
        xdmp:eval(
          fn:concat("declare namespace _1 = """,$ns,"""; declare variable $base-constraint external; xdmp:estimate(cts:search(/_1:",$local-name,",$base-constraint))"),
          (fn:QName("","base-constraint"),$base-constraint)
        )
    where $frequency > 0
    return (
      (: <root-element> <type>element</type> <database>{xdmp:database()}</database> <id>{xdmp:md5($k)}</id> <namespace>{$ns}</namespace> <localname>{$local-name}</localname> <frequency>{$frequency}</frequency> </root-element> :)
      xdmp:key-from-QName(fn:QName($ns,$local-name))
    ),
    xdmp:log(fn:concat("Finished Root Frequency:",xdmp:elapsed-time()),"debug")
  )
  else
    let $constraint :=
      if(fn:exists($qnames)) then
        for $qn in $qnames
        return cts:not-query(cts:element-query($qn,cts:and-query(())))
      else ()
    let $rnode :=
      if(fn:not(fn:empty($qnames))) then
        fn:subsequence(cts:search(/element(),cts:and-query(($base-constraint,$constraint)),"unfiltered"),1,1)
      else
        fn:subsequence(cts:search(/element(),cts:and-query(()),"unfiltered"),1,1)
    return
      if($rnode instance of element() and fn:not(fn:node-name($rnode) = $qnames)) then
        let $qname := fn:node-name($rnode)
        let $key := fn:concat("{",fn:namespace-uri($rnode),"}",fn:local-name($rnode))
        return a:get-root-elements(fn:false(),($qnames,$qname),($key,$results))
      else if(fn:node-name($rnode) = $qnames) then
        a:get-root-elements(fn:true(),$qnames,$results)
      else if(fn:empty($rnode)) then
        a:get-root-elements(fn:true(),$qnames,$results)
      else
        a:get-root-elements(fn:false(),$qnames,$results)
};

a:get-root-elements(fn:false(),(),())
[MarkLogic Dev General] Getting all relation paths between two entities in Semantic
You can use the sem:transitive-closure function from the sem library to walk the graph relations where s = o and o = s. From there you can collect all the subjects and use them as a query. http://docs.marklogic.com/sem:transitive-closure

xquery version "1.0-ml";
import module namespace sem = "http://marklogic.com/semantics" at "/MarkLogic/semantics.xqy";
sem:transitive-closure(sem:iri("http://www.w3.org/People/Berners-Lee/card#i"), sem:iri("http://xmlns.com/foaf/0.1/knows"), 9)

Another option is my transitive library, which I am actively maintaining on GitHub. It provides a higher-level API aligned with XPath-like function calls: https://github.com/garyvidal/ml-libraries/tree/master/transitive

> Sir, Is there any way we can get the paths between two entities in a semantic graph. For example, If a isA alphabet , b isA alphabet then the path is a - alphabet - b. Same way for a partOf OddWord partOf AllWords , a partOf EvenWord partOf AllWords the path is a - OddWord - AllWords - EvenWord b Any way we can use semantic query construct to fetch the path like this.. It is a finding paths in a graph as we studies in Data Structures with the context of semantic graphs.
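To make the idea of a depth-limited transitive walk concrete, here is a small plain-JavaScript sketch over an in-memory list of triples. It illustrates the concept only and is not how sem:transitive-closure is implemented; the function name, the triple layout and the sample data are assumptions for the example, and the result deliberately includes the seed nodes.

```javascript
// Follow one predicate outward from the seed nodes, at most `limit` hops.
// Triples are [subject, predicate, object] arrays.
function transitiveClosure(triples, seeds, predicate, limit) {
  const seen = new Set(seeds);
  let frontier = [...seeds];
  for (let depth = 0; depth < limit && frontier.length > 0; depth++) {
    const next = [];
    for (const [s, p, o] of triples) {
      if (p === predicate && frontier.includes(s) && !seen.has(o)) {
        seen.add(o); // first time this node is reached
        next.push(o);
      }
    }
    frontier = next; // expand only from newly reached nodes
  }
  return [...seen];
}

// Made-up graph loosely based on the question quoted above.
const sampleTriples = [
  ["a", "partOf", "OddWord"],
  ["OddWord", "partOf", "AllWords"],
  ["b", "partOf", "EvenWord"],
  ["EvenWord", "partOf", "AllWords"],
  ["a", "isA", "alphabet"],
];
const reachableFromA = transitiveClosure(sampleTriples, ["a"], "partOf", 9);
const oneHopFromA = transitiveClosure(sampleTriples, ["a"], "partOf", 1);
```

Finding the actual path between two entities would additionally need the predecessor of each reached node to be recorded; the sketch only collects what is reachable.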
[MarkLogic Dev General] What about XQuery proper?
Hello Xavier Masson,

Thanks for asking the tough questions. I will try to answer inline as best I can.

> We are currently evaluating ML, first as an XML database supporting XQuery. We already have one and a complex but not large set of data and would like to benefit from ML's rich features. ML seems an amazing database and functionally brings a lot to the table. The recent communications around ML 8 and the evolutions in the recent versions are making me question the support of standard XQuery. More precisely, the absence of support for XQuery 3.0 beyond the prefix is a bit puzzling (I am thinking mostly about group by clauses). Said otherwise, this post http://markmail.org/message/fpsimswbt3gteooj from 2012 seems to still be relevant and did not get any answer ;).

In fact MarkLogic supports many of the features in XQuery 3.0.

Supported:

* try/catch expressions (3.15 Try/Catch Expressions <http://www.w3.org/TR/xquery-30/#id-try-catch>). Actually supported well before the standard was ratified.
* Dynamic function call (3.2.2 Dynamic Function Call <http://www.w3.org/TR/xquery-30/#id-dynamic-function-invocation>).
* Inline function expressions (3.1.7 Inline Function Expressions <http://www.w3.org/TR/xquery-30/#id-inline-func>).
* Private functions (4.18 Function Declaration <http://www.w3.org/TR/xquery-30/#FunctionDeclns>).
* Switch expressions (3.13 Switch Expression <http://www.w3.org/TR/xquery-30/#id-switch>).
* Computed namespace constructors (3.9.3.7 Computed Namespace Constructors <http://www.w3.org/TR/xquery-30/#id-computed-namespaces>).
* Output declarations (2.2.4 Serialization <http://www.w3.org/TR/xquery-30/#id-serialization>).
* Annotations (4.15 Annotations <http://www.w3.org/TR/xquery-30/#id-annotations>).
* Function assertions <http://www.w3.org/TR/xquery-30/#dt-function-assertion> in function tests <http://www.w3.org/TR/xquery-30/#doc-xquery30-FunctionTest>.
* A string concatenation operator (3.6 String Concatenation Expressions <http://www.w3.org/TR/xquery-30/#id-string-concat-expr>).
* A mapping operator (3.17 Simple map operator (!) <http://www.w3.org/TR/xquery-30/#id-map-operator>).

Not supported (I will shed light on why below):

* group by clause in FLWOR expressions (3.10.7 Group By Clause <http://www.w3.org/TR/xquery-30/#id-group-by>).
* count clause in FLWOR expressions (3.10.6 Count Clause <http://www.w3.org/TR/xquery-30/#id-count>).
* tumbling window and sliding window in FLWOR expressions (3.10.4 Window Clause <http://www.w3.org/TR/xquery-30/#id-windows>).
* allowing empty in 3.10.2 For Clause <http://www.w3.org/TR/xquery-30/#id-xquery-for-clause>, for functionality similar to outer joins in SQL.

Group by/count:
Understand that implementing the group by and count clauses is a resource-intensive operation, and not something that should be supported loosely unless there is a clear way to scale it as a database operation. The engine has some support for group-by operations using cts:value-tuples, but it requires indexes to ensure optimal operation in MarkLogic. A group by across a large dataset would run into scaling issues without the proper index structures to support it. There may be some future support for this as SPARQL gains the ability to support group by. But the indexing structures would need to be more aligned with a database than with how MarkLogic indexes content for search and retrieval.

Tumbling/sliding window and allowing empty:
These features are generally less well known and, although very useful for small cases, are not primarily on the radar from a necessity/use-case perspective. Just my 2 cents.

> I fully understand and appreciate that ML needs to move forward into buzzland, from JSON (good) to Hadoop and Big Data (who really has a truly 'big data' dataset? ;) ), even throwing that abomination that is JavaScript into the mix ( ;) ), but does that mean that XML and XQuery support will stop evolving and be engine-level/internals stuff (because I understand that ML is still at its core an XML/document database)?

Yes, JavaScript is like putting bumper stickers on a Bentley, but it allows the bumper-sticker citizens to play with a powerful engine and technology. The fact that we use V8 shows that we have great performance for JavaScript to leverage MarkLogic, if you so choose. What it also does is open up the integration of complex algorithms already available in JavaScript that would not be available to XQuery as a community.

> 2) I am also a bit troubled by the constant cts/xdmp:hack that seems to be the only way of getting performance out of ML. I know that this kind of custom method is supported by the W3C specs, but my problem is that it seems to be the complete and constant substitute for FLWOR. I can understand such an escape hatch for specific optimizations (the last 20% perf ;)) or to access functionality outside of the standard
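
To make the group-by point earlier in this message more concrete, here is a minimal sketch of index-backed grouping with the lexicon functions. It is only an illustration, not a replacement for the XQuery 3.0 group by clause. It assumes an element range index already exists on a hypothetical element named category; the element name and the output element names are made up for the example.

```xquery
xquery version "1.0-ml";

(: Assumption: an element range index exists on the hypothetical
   element "category". Without it this call raises an error. :)
for $value in cts:values(
  cts:element-reference(xs:QName("category")),
  (),
  ("frequency-order"))
return
  (: one result per distinct value, with its count from the index :)
  <group>
    <key>{$value}</key>
    <count>{cts:frequency($value)}</count>
  </group>
```

Because the distinct values and their counts are read from the range index rather than from the documents, the cost tracks the number of distinct values, not the number of documents. That is the scaling argument made above.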
Re: [MarkLogic Dev General] Marklogic XSLT transformer returning file without any tags
It would appear you do not have a template that matches /, or you do not have a template that matches every element. If you are doing an identity transform, the simplest template should match and copy the nodes that don't match any other template.

    <xsl:stylesheet .. version="2.0">
      <!-- Other templates here -->
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>

Another issue is that $doc may not be a document node but an element node. Always wrap your node in document {}, like:

    xdmp:xslt-invoke("example.xsl",
      if ($doc instance of document-node()) then $doc else document {$doc},
      (), ())

-----Original Message-----
Sent: Monday, November 24, 2014 3:00 PM
Subject: General Digest, Vol 125, Issue 38

Today's Topics:
1. Marklogic XSLT transformer returning file without any tags (S.Gowtham)
2. Re: Marklogic XSLT transformer returning file without any tags (Christopher Hamlin)
3. Re: Marklogic XSLT transformer returning file without any tags (Florent Georges)

Message: 1
Date: Mon, 24 Nov 2014 18:58:08 +0800
From: S.Gowtham gowti1...@gmail.com
Subject: [MarkLogic Dev General] Marklogic XSLT transformer returning file without any tags

Hi all,

I invoked the transformation query below to transform the XML with XSLT.

    xdmp:xslt-invoke("example.xsl", $doc, (), ())

example.xsl contains the following:

    <xsl:template match="CaseRef">
      <xsl:variable xmlns:encoder="xalan://java.net.URLEncoder" name="urlEncodedCit"
          select="encoder:encode(substring(string(@href),2))"/>
      <xsl:value-of select="$urlEncodedCit"/>
    </xsl:template>

But the MarkLogic Query Console returned the file without any tags. It seems to me that the file is not transformed correctly. Can anyone help me solve the issue?

Thanks,
Best Regards,
$.Gowth@m

Message: 2
Date: Mon, 24 Nov 2014 09:05:39 -0500
From: Christopher Hamlin cbham...@gmail.com
Subject: Re: [MarkLogic Dev General] Marklogic XSLT transformer returning file without any tags

Hi,

If that is the complete contents of the XSL, then what you should get back is all the text in the source file, with the href attribute text from the CaseRef tags. XSLT has default templates that match anything but CaseRef, but they only return text. Check the top answer here for an explanation:

http://stackoverflow.com/questions/3360017/why-does-xslt-output-all-text-by-default

/ch

Message: 3
Date: Mon, 24 Nov 2014 16:36:30 +0100
From: Florent Georges li...@fgeorges.org
Subject: Re: [MarkLogic Dev General] Marklogic XSLT transformer returning file without any tags
Message-ID:
Re: [MarkLogic Dev General] General Digest, Vol 124, Issue 76
Gary,

As Justin suggested, creating a path range index on the values is the most efficient option. You can then use cts:values and either add a range-query constraint or pass in the start argument, like:

    cts:values(
      cts:path-reference("/path-to-price/price", "type=decimal"),
      1000.00,
      (),
      cts:path-range-query("/path-to-price/price", "=", 1000.00)
    )

Regards,
Gary Vidal
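
For anyone following along, the path range index mentioned above could be created from the Admin UI or with the Admin API. The following is a rough sketch only: the database name "Documents" is an assumption, the path is the placeholder used earlier in this message, and adding the index will trigger a reindex if reindexing is enabled.

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Assumption: the target database is named "Documents". :)
let $config := admin:get-configuration()
let $dbid := xdmp:database("Documents")
(: decimal index on the placeholder path; the collation is only
   meaningful for string indexes, so it is left as the empty string :)
let $index := admin:database-range-path-index(
  $dbid, "decimal", "/path-to-price/price", "", fn:false(), "ignore")
let $config := admin:database-add-range-path-index($config, $dbid, $index)
return admin:save-configuration($config)
```

Once the index has finished building, a cts:values call over a matching cts:path-reference can resolve from the index alone.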
[MarkLogic Dev General] Passing empty value to function not working
You can always try using the ? occurrence indicator. This allows you to pass zero or one values as the argument to the function. Of course you then have to check the value with fn:exists($sessionId) and handle each case accordingly.

    declare function xutils:validateSession($sessionId as xs:string?) {..}

-----Original Message-----
Sent: Friday, September 26, 2014 8:08 AM
Subject: General Digest, Vol 123, Issue 50

Today's Topics:
1. Re: Passing empty value to function not working (David Ennis)
2. Re: Passing empty value to function not working (David Lee)

Message: 1
Date: Fri, 26 Sep 2014 13:14:50 +0200
From: David Ennis david.en...@hinttech.com
Subject: Re: [MarkLogic Dev General] Passing empty value to function not working

Hi,

MarkLogic allows for function overloading. Perhaps that could be your solution.

Kind Regards,
David Ennis

On 26 September 2014 13:00, Kapoor, Pragya pkapo...@innodata.com wrote:

> Hi,
>
> I need to pass an empty value to a util function, which is not working.
>
>     declare function xutils:validateSession($sessionId as xs:string)
>     {
>       if ($sessionId ne "") then
>         let $document := fn:doc($config:USER_SESSIONS)
>         return
>           <errorCode>
>             <code>500</code>
>             <description>server error</description>
>           </errorCode>
>       else
>         <errorCode>
>           <code>517</code>
>           <description>Session Id can't be empty</description>
>         </errorCode>
>     };
>
> I am calling this function from another file and it is returning an empty sequence. But if I make this function local:validateSession($sessionId as xs:string) then it works.
>
> Calling file:
>
>     import module namespace config = "config" at "/rest-apis/utils/config.xqy";
>
>     let $node := '<inputString><sessionId></sessionId></inputString>'
>     let $node := xdmp:unquote($node)
>     let $sessionId := $node//sessionId/text()
>     return
>       let $validateFlag := xutils:validateSession($sessionId)
>       return $validateFlag
>
> Please advise.
>
> Thanks,
> Pragya

Message: 2
Date: Fri, 26 Sep 2014 12:08:06 +0000
From: David Lee david@marklogic.com
Subject: Re: [MarkLogic Dev General] Passing empty value to function not working

Almost certainly this is due to xdmp:mapping being on.

https://docs.marklogic.com/guide/xquery/enhanced#id_55459
https://docs.marklogic.com/guide/xquery/langoverview#id_45023

For some this is an incredible feature. For others it's surprising and difficult to debug. It's ON by default. In short, what it does is call functions that have an argument that can take exactly 1 of
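
The two suggestions in this thread (the ? occurrence indicator and turning function mapping off) can be combined. The following is a minimal sketch rather than the poster's actual module: local:validate-session is a hypothetical stand-in for xutils:validateSession, and the input document mirrors the one in the question.

```xquery
xquery version "1.0-ml";

(: Turn off function mapping for this module, so a call with an
   empty sequence is never silently skipped. :)
declare option xdmp:mapping "false";

(: Hypothetical stand-in for xutils:validateSession. The ? lets the
   caller pass an empty sequence explicitly. :)
declare function local:validate-session($session-id as xs:string?)
  as element(errorCode)?
{
  if (fn:empty($session-id) or $session-id eq "") then
    <errorCode>
      <code>517</code>
      <description>Session Id can't be empty</description>
    </errorCode>
  else ()
};

let $node := xdmp:unquote("<inputString><sessionId/></inputString>")
(: an empty element has no text node, so this is the empty sequence :)
let $session-id := $node//sessionId/text()
return local:validate-session($session-id)
```

With the parameter typed xs:string? the function body runs even for the empty sequence, so the 517 error element comes back instead of nothing. With mapping off and a parameter typed xs:string (no ?), the same call would raise a type error rather than quietly returning the empty sequence, which is usually easier to debug.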
Re: [MarkLogic Dev General] false match on cts:element-value-query
Gary,

I think the problem is due to the tokenization of the value. If you run cts:tokenize against the string, you can see which characters are treated as separate tokens.

    declare namespace html = "http://www.w3.org/1999/xhtml";

    let $obj := json:object()
    return (
      cts:tokenize("[Folders].[Orders].[OrderDate]") ! map:put($obj, ., xdmp:describe(.)),
      $obj
    )

Returns:

    {
      "[": cts:punctuation("["),
      "Folders": cts:word("Folders"),
      "]": cts:punctuation("]"),
      ".": cts:punctuation("."),
      "Orders": cts:word("Orders"),
      "OrderDate": cts:word("OrderDate")
    }

With punctuation-insensitive matching:

    let $node := <a><p>[Folders].[Orders].[OrderDate]</p></a>
    return cts:contains($node,
      cts:element-value-query(xs:QName("p"), "Folders.Orders.OrderDate", ("punctuation-insensitive")))

[returns] true

With punctuation-sensitive matching:

    let $node := <a><p>[Folders].[Orders].[OrderDate]</p></a>
    return cts:contains($node,
      cts:element-value-query(xs:QName("p"), "Folders.Orders.OrderDate", ("punctuation-sensitive")))

[returns] false

Starting with ML7 you can control the tokenization of string values and how they are treated:

http://docs.marklogic.com/admin:database-tokenizer-override
http://docs.marklogic.com/guide/search-dev/custom-tokenization#id_80979
Re: [MarkLogic Dev General] General Digest, Vol 116, Issue 2
In furtherance of Mary's post, here is a function that I put together a while back to do the same thing.

    declare function validate-schema-inline(
      $document as node(),
      $schema-uris as xs:string*,
      $mode as xs:string
    ) {
      let $xsl :=
        <xsl:stylesheet
            xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
            xmlns:xs="http://www.w3.org/2001/XMLSchema"
            xmlns:xdmp="http://marklogic.com/xdmp"
            xmlns:map="http://marklogic.com/map"
            xmlns:error="http://marklogic.com/xdmp/error"
            extension-element-prefixes="xdmp map"
            version="2.0">
          <xsl:output method="xml" indent="yes"/>
          {
            for $uri in $schema-uris
            return <xsl:import-schema>{fn:doc($uri)}</xsl:import-schema>
          }
          <xsl:template match="/">
            <xsl:apply-templates/>
          </xsl:template>
          <xsl:template match="ns1:*">
            <xsl:copy-of select="." validation="{$mode}"/>
          </xsl:template>
        </xsl:stylesheet>
      return
        xdmp:xslt-eval(
          document {$xsl},
          if ($document instance of document-node()) then $document else document {$document}
        )
    };

Gary Vidal
Principal Consultant
MarkLogic Corporation

-----Original Message-----
Sent: Tuesday, February 04, 2014 1:23 PM
Subject: General Digest, Vol 116, Issue 2

Today's Topics:
1. Post-read trigger? (Geert J.)
2. schema validation (Whitby, Rob)
3. Re: schema validation (Geert J.)
4. Re: schema validation (Anthony Coates)
5. Re: schema validation (Mary Holstege)

Message: 1
Date: Tue, 4 Feb 2014 08:29:02 +0100
From: Geert J. geert.jos...@dayon.nl
Subject: [MarkLogic Dev General] Post-read trigger?

Hi,

Would a post-read trigger (only when accessing a document explicitly using doc()) make sense? For instance, to update a view count in the document properties. It would save one from spawning a task oneself.

Cheers,
Geert Josten

Message: 2
Date: Tue, 4 Feb 2014 12:55:24 +0000
From: Whitby, Rob rob.whi...@springer.com
Subject: [MarkLogic Dev General] schema validation

Hi,

I can't seem to figure out how to use a schema without loading it into the schemas db. Is something like this possible?

    let $xml := <foo>...
    let $schema := <xs:schema>...
    let $validate := ???

Thanks,
Rob

Message: 3
Date: Tue, 4 Feb 2014 15:10:12 +0100
From: Geert J. geert.jos...@dayon.nl
Subject: Re: [MarkLogic Dev General] schema validation

Hi Rob,

As far as I know, no. But you can specify a different database as the Schemas database. It should in principle be possible to select a docs database itself as its Schemas database.

Kind regards,
Geert
Re: [MarkLogic Dev General] Size of an index
Ravinder,

Here is a partial implementation of how to calculate the on-disk size of each index. I have not yet figured out how to resolve every index type (geospatial, for example), but it should give you a good sense of the size of each index. It assumes all data is on a single host, and it looks into each forest directory and sums the sizes. If there is interest I will post it to GitHub for further updates.

What it returns is something like this:

    <range-index-statistics>
      <all-indexes-size>789680</all-indexes-size>
      <element-indexes-size>747608</element-indexes-size>
      <attribute-indexes-size>0</attribute-indexes-size>
      <field-indexes-size>0</field-indexes-size>
      <path-indexes-size>42072</path-indexes-size>
      <range-indexes>
        <element-range-index>
          <key>6ac11acd756cd4da-string</key>
          <namespace-uri>http://marklogic.com/content-analyzer</namespace-uri>
          <localname>attribute-localname</localname>
          <collation>http://marklogic.com/collation/codepoint</collation>
          <type>string</type>
          <size>2048</size>
          <file-count>4</file-count>
        </element-range-index>
        <element-range-index>
          <key>b6ae2f8eb8298059-string</key>
          <namespace-uri>http://marklogic.com/content-analyzer</namespace-uri>
          <localname>attribute-namespace</localname>
          <collation>http://marklogic.com/collation/codepoint</collation>
          <type>string</type>
          <size>2048</size>
          <file-count>4</file-count>
        </element-range-index>
        <element-range-index>
          <key>a2e0f65bb0efec5a-string</key>
          <namespace-uri>http://marklogic.com/content-analyzer</namespace-uri>
          <localname>child-localname</localname>
          <collation>http://marklogic.com/collation/codepoint</collation>
          <type>string</type>
          <size>13064</size>
          <file-count>4</file-count>
        </element-range-index>
        <element-range-index>
          <key>eece0b1cf3ac97d9-string</key>
          <namespace-uri>http://marklogic.com/content-analyzer</namespace-uri>
          <localname>child-namespace</localname>
          <collation>http://marklogic.com/collation/codepoint</collation>
          <type>string</type>
          <size>13064</size>
          <file-count>4</file-count>
        </element-range-index>
        <element-range-index>
          <key>96bf2bc6589a5bb2-dateTime</key>
          ...

Regards,
Gary Vidal
Media Consultant
MarkLogic Corporation

-----Original Message-----
Sent: Tuesday, January 28, 2014 3:00 PM
Subject: General Digest, Vol 115, Issue 32

Today's Topics:
1. Re: Size of an index (RAVINDER MAAN) (Paul M)

Message: 1
Date: Mon, 27 Jan 2014 10:28:43 -0800 (PST)
From: Paul M pjm...@yahoo.com
Subject: Re: [MarkLogic Dev General] Size of an index (RAVINDER MAAN)

What tasks are you trying to accomplish that require knowing the size of an index on disk? There may be solutions other than the definitive size of the index.

-Paul

Sent: Saturday, January 25, 2014 3:00 PM
Subject: General Digest, Vol 115, Issue 31

Today's Topics:
1. Size of an index (RAVINDER MAAN)
2. Re: Size of an index (Danny Sokolsky)
3. Re: Size of an index (Geert J.)
4. Re: Size of an index (Michael Blakeley)

Message: 1
Date: Fri, 24 Jan 2014 19:41:27 +0000
From: RAVINDER MAAN rsmaan...@gmail.com
Subject: [MarkLogic Dev General] Size of an index

Hi all,

Is there any way to find the size of an index on disk?

Thanks
Re: [MarkLogic Dev General] cts query with validation to check
Not sure what version you are using, but many of the cts:element-*-query functions support the min-occurs=n and max-occurs=n options. You can use the following to test the functionality:

    xquery version "1.0-ml";
    declare namespace html = "http://www.w3.org/1999/xhtml";

    xdmp:document-insert("/test/progs/1.xml",
      <doc><program>A</program><program>B</program></doc>);

    xdmp:document-insert("/test/progs/2.xml",
      <doc><program>A</program></doc>);

    cts:search(
      xdmp:directory("/test/", "infinity"),
      cts:element-value-query(xs:QName("program"), "*", ("wildcarded", "max-occurs=1"))
    )

RETURNS

    <doc>
      <program>A</program>
    </doc>

Gary Vidal
Media Consultant
MarkLogic Corporation
[MarkLogic Dev General] deleteing duplicate elements from XMLs in Marklogic
Nikin,

A very simple way to dedupe is to use maps. The important thing is to clear the map after each parent traversal, so that each child of b is unique within its own b rather than unique across the entire document.

Also, when updating nodes in MarkLogic it is important not to update or delete a child node of a parent node that is itself being updated. It is better to traverse the whole document, or a portion of it, so that your changes constitute a single update within the document. This helps avoid conflicting updates.

[Query Console]

    xquery version "1.0-ml";
    declare namespace local = "urn:local";
    declare option xdmp:mapping "false";

    declare function local:prune-unique(
      $context,
      $push-map
    ) {
      if ($context/@href) then
        (: drop the element if its href was already seen under this parent :)
        if (map:get($push-map, fn:normalize-space($context/@href))) then ()
        else (
          map:put($push-map, fn:normalize-space($context/@href), $context),
          $context
        )
      else if ($context/element()) then
        element {fn:node-name($context)} {
          $context/@*,
          for $node in $context/element()
          return local:prune-unique($node, $push-map),
          (: reset the map once this parent's children are processed :)
          map:clear($push-map)
        }
      else $context
    };

    let $nodes :=
      <a>
        <b>
          <c href="input1"/>
          <c href="input2"/>
          <c href="input1"/>
          <c href="input1"/>
          <c href="input1"/>
          <c href="input3"/>
          <c href="input3"/>
          <c href="input1"/>
          <c href="input1"/>
          <c href="input1"/>
        </b>
        <b>
          <c href="input1"/>
          <c href="input2"/>
          <c href="input1"/>
          <c href="input1"/>
          <c href="input1"/>
        </b>
      </a>
    return
      (: Assuming $nodes is pulled from database :)
      xdmp:node-replace($nodes, local:prune-unique($nodes, map:map()))

returns

    <a>
      <b>
        <c href="input1"/>
        <c href="input2"/>
        <c href="input3"/>
      </b>
      <b>
        <c href="input1"/>
        <c href="input2"/>
      </b>
    </a>

Gary Vidal
Media Consultant
MarkLogic Corporation
[MarkLogic Dev General] roxy and maven
Have you tried using the MarkLogic Ant Tasks? Since it's closer to its Maven brethren, it could be used inside a javaproc.

http://github.com/garyvidal/marklogic-ant-tasks

Gary Vidal
Media Consultant
MarkLogic Corporation
Re: [MarkLogic Dev General] Highlighting query
Pragya,

If you want full phrase matches where the longest string match wins, you will probably want to sort the terms by length and iterate over each phrase or word, highlighting as noted earlier but with constructed elements. The order in which you apply them makes the difference: if you highlight terms from longest to shortest, the longer terms get matched before any shorter term they contain.

In your highlight function you may also want to check whether the node you are highlighting has already been highlighted, by looking at the $cts:node variable and stepping up the ancestor path. If the match already sits inside a span, return the text unchanged; otherwise wrap it:

    if (fn:exists($cts:node/ancestor-or-self::html:span))
    then $cts:text
    else <html:span>{$cts:text}</html:span>
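
Putting the two ideas above together (longest term first, skip text that is already highlighted), a minimal sketch might look like this. The helper names local:highlight-term and local:highlight-all, the sample paragraph and the term list are all made up for illustration. It also assumes plain word queries are good enough for the terms.

```xquery
xquery version "1.0-ml";
declare namespace html = "http://www.w3.org/1999/xhtml";

(: Hypothetical helper: highlight one term, unless the matched text
   already sits inside a span from an earlier pass. :)
declare function local:highlight-term($node as node(), $term as xs:string)
  as node()
{
  cts:highlight(
    $node,
    cts:word-query($term),
    if (fn:exists($cts:node/ancestor::html:span))
    then $cts:text
    else <html:span>{$cts:text}</html:span>
  )
};

(: Hypothetical helper: apply the terms one at a time, in the order given. :)
declare function local:highlight-all($node as node(), $terms as xs:string*)
  as node()
{
  if (fn:empty($terms)) then $node
  else
    local:highlight-all(
      local:highlight-term($node, $terms[1]),
      fn:subsequence($terms, 2))
};

let $doc := <p>Google Fiber is a broadband service from Google.</p>
let $terms := ("Google", "Google Fiber")
(: longest phrases first, so they win over the shorter terms they contain :)
let $longest-first :=
  for $t in $terms
  order by fn:string-length($t) descending
  return $t
return local:highlight-all($doc, $longest-first)
```

Each pass runs over the output of the previous pass, which is why the ancestor check works: by the time "Google" is processed, the text of "Google Fiber" is already inside a span and is left alone.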
[MarkLogic Dev General] weightage in xml tags for search
Sundar,

The simplest approach would be to configure fields in your database. This allows you to associate multiple element or element/attribute names with a single named structure such as title or body. Each element can contribute a weighting, which is applied at indexing time. If you already have a database with content, you will have to pay the cost of reindexing the content to add the field indexes.

More information on fields:
http://docs.marklogic.com/guide/admin/fields#chapter

You can then use a field constraint in your search:search options to refer to the fields.

Date: Wed, 20 Nov 2013 12:55:25 +0000
From: Sundaravadivel Kandasamy sundaravadive...@infosys.com
Subject: [MarkLogic Dev General] weightage in xml tags for search

> Hi,
>
> I want to set the weightage for the 'body' and 'title' elements in XML; the XML content will have other elements besides body and title. I am using the search API for search, and we have five different XML contents with different namespaces for the title and body elements.
>
> Is there any good approach to achieve this weightage? cts:element-word-query to set the weightage may not help.
>
> Thanks.
> Regards,
> Sundar.
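
As a rough illustration of the field constraint mentioned above, the search options could look something like the following. This is a sketch only: it assumes two fields named title and body have already been configured on the database (with their included elements and weights set there), and the search string is made up.

```xquery
xquery version "1.0-ml";
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";

(: Assumption: fields named "title" and "body" exist on the database. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <constraint name="title">
      <word>
        <field name="title"/>
      </word>
    </constraint>
    <constraint name="body">
      <word>
        <field name="body"/>
      </word>
    </constraint>
  </options>
return search:search("title:marklogic", $options)
```

Because the element weights live in the field definition, the same options work across all five document types regardless of which namespace their title and body elements use.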
Re: [MarkLogic Dev General] Highlighting query
let $doc :=
<p>
Google Inc. is an American multinational corporation specializing in Internet-related services and products. These include search, cloud computing, software and online advertising technologies.[7] Most of its profits are derived from AdWords.[8][9] Google was founded by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University. Together they own about 16 percent of its shares. They incorporated Google as a privately held company on September 4, 1998. An initial public offering followed on August 19, 2004. Its mission statement from the outset was to organize the world's information and make it universally accessible and useful,[10] and its unofficial slogan was Don't be evil.[11][12] In 2006 Google moved to headquarters in Mountain View, California, nicknamed the Googleplex. Rapid growth since incorporation has triggered a chain of products, acquisitions and partnerships beyond Google's core search engine. It offers online productivity software including email (Gmail), an office suite (Google Drive), and social networking (Google+). Desktop products include applications for web browsing, organizing and editing photos, and instant messaging. The company leads the development of the Android mobile operating system and the browser-only Chrome OS[13] for a specialized type of netbook known as a Chromebook. Google has moved increasingly into communications hardware: it partners with major electronics manufacturers in production of its high-end Nexus devices and acquired Motorola Mobility in May 2012.[14] In 2012, a fiber-optic infrastructure was installed in Kansas City to facilitate a Google Fiber broadband service.[15] The corporation has been estimated to run more than one million servers in data centers around the world[16] and to process over one billion search requests[17] and about 24 petabytes of user-generated data each day.[18][19][20][21] In December 2012 Alexa listed google.com as the most visited website in the world. Numerous Google sites in other languages figure in the top one hundred, as do several other Google-owned sites such as YouTube and Blogger.[22] Its market dominance has led to criticism over issues including copyright, censorship, and privacy.[23][24]
</p>
let $tmpdoc := $doc
let $map := map:map()
let $_ := (
  map:put($map, "blue", ("Google", "Google Inc.", "YouTube", "Stanford University", "Motorola")),
  map:put($map, "red", ("Larry Page", "Sergey Brin")),
  map:put($map, "green", ("Android", "Chrome", "Google+", "Gmail"))
)
let $highlight :=
  for $key in map:keys($map)
  let $terms := map:get($map, $key)
  return
    xdmp:set($tmpdoc,
      cts:highlight($tmpdoc,
        cts:or-query($terms ! cts:word-query(.)),
        <span style="color:{$key}">{$cts:text}</span>))
return $tmpdoc
Re: [MarkLogic Dev General] Highlighting query
David,

More than likely this is due to how the word queries are resolved and to the phrase-through elements defined on the database. Changing the namespace to html, and assuming the database running the query uses phrase-throughs for spans, resolves this issue. Remember that if the blue elements run before the green ones, a boundary is created that causes the span to act as a word boundary. By tweaking the query a bit and adding html namespaces, you will see that it resolves correctly against the Documents database. Please make sure you have phrase-throughs enabled on your database if the results don't come out as expected.

Another technique is to just put an anchor at the end of the term, as an icon, so that you don't get <a href/> overlaps. You could do something like this:

    xdmp:set($tmpdoc,
      cts:highlight($tmpdoc,
        cts:or-query($terms ! cts:word-query(., "punctuation-sensitive")),
        ($cts:text,
         <span xmlns="http://www.w3.org/1999/xhtml" class="person or thing" style="color:{$key}"/>)))

Here is the corrected code and results:

let $doc :=
<p xmlns="http://www.w3.org/1999/xhtml">
Google Inc. is an American multinational corporation specializing in Internet-related services and products. These include search, cloud computing, software and online advertising technologies.[7] Most of its profits are derived from AdWords.[8][9] Google was founded by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University. Together they own about 16 percent of its shares. They incorporated Google as a privately held company on September 4, 1998. An initial public offering followed on August 19, 2004. Its mission statement from the outset was to organize the world's information and make it universally accessible and useful,[10] and its unofficial slogan was Don't be evil.[11][12] In 2006 Google moved to headquarters in Mountain View, California, nicknamed the Googleplex. Rapid growth since incorporation has triggered a chain of products, acquisitions and partnerships beyond Google's core search engine. It offers online productivity software including email (Gmail), an office suite (Google Drive), and social networking (Google+). Desktop products include applications for web browsing, organizing and editing photos, and instant messaging. The company leads the development of the Android mobile operating system and the browser-only Chrome OS[13] for a specialized type of netbook known as a Chromebook. Google has moved increasingly into communications hardware: it partners with major electronics manufacturers in production of its high-end Nexus devices and acquired Motorola Mobility in May 2012.[14] In 2012, a fiber-optic infrastructure was installed in Kansas City to facilitate a Google Fiber broadband service.[15] The corporation has been estimated to run more than one million servers in data centers around the world[16] and to process over one billion search requests[17] and about 24 petabytes of user-generated data each day.[18][19][20][21] In December 2012 Alexa listed google.com as the most visited website in the world. Numerous Google sites in other languages figure in the top one hundred, as do several other Google-owned sites such as YouTube and Blogger.[22] Its market dominance has led to criticism over issues including copyright, censorship, and privacy.[23][24]
</p>
let $tmpdoc := $doc
let $map := map:map()
let $_ := (
  map:put($map, "blue", ("Google", "Google Inc.", "YouTube", "Stanford University", "Motorola")),
  map:put($map, "red", ("Larry Page", "Sergey Brin")),
  map:put($map, "green", ("Android", "Chrome", "Google+", "Gmail"))
)
let $highlight :=
  for $key in map:keys($map)
  let $terms := map:get($map, $key)
  return
    xdmp:set($tmpdoc,
      cts:highlight($tmpdoc,
        cts:or-query($terms ! cts:word-query(., "punctuation-sensitive")),
        <span xmlns="http://www.w3.org/1999/xhtml" style="color:{$key}">{$cts:text}</span>))
return $tmpdoc

[Returns]
...and social networking (<span style="color:blue"><span style="color:green">Google</span></span><span style="color:green">+</span>).
Re: [MarkLogic Dev General] Sorting in Search API
Sundaravadivel,

Sorting by least relevance is a bad move, especially over large result sets. Consider that if you have 1M records, the server will need to compute scores for all 1M records before it can return a sort order. Could you give the use case for needing results with lower scores first? If that really is the requirement, it would be better to invert the search using a not-query($your-query) and work from there.

Gary Vidal
Media Consultant
MarkLogic Corporation
gary.vi...@marklogic.com
Phone: +1 917 576-5794
Skype: ml-garyvidal
www.marklogic.com
Re: [MarkLogic Dev General] export / import ?
xdmp:spawn will not return results unless you explicitly ask for them in the options node of the spawn. The docs show an example of the spawn returning a result.

http://docs.marklogic.com/xdmp:spawn#spawnresultex

    let $x := xdmp:spawn("/oneplusone.xqy", (),
      <options xmlns="xdmp:eval">
        <result>{fn:true()}</result>
      </options>)
    return ($x + 2)

Gary Vidal
Media Consultant
MarkLogic Corporation
gary.vi...@marklogic.com
Phone: +1 917 576-5794
Skype: ml-garyvidal
www.marklogic.com
Re: [MarkLogic Dev General] XDMP-FORESTNOT-- error occurred while re-indexing the
Abishek,

You may want to remove the label files in the forest directories that are having issues.

1. Since the forests are not available, you should be able to rename the label files noted in the errors to label.bad, or move them out of the directory.
2. From the Admin console, restart the forest. You will see that the system recreates the label file, and the forest should be available again.

Gary Vidal
Media Consultant
MarkLogic Corporation
gary.vi...@marklogic.com
Phone: +1 917 576-5794
Skype: ml-garyvidal
www.marklogic.com

-----Original Message-----
Sent: Tuesday, November 05, 2013 4:02 AM
To: general@developer.marklogic.com
Subject: General Digest, Vol 113, Issue 8

Message: 1
Date: Tue, 5 Nov 2013 07:29:08 +
From: abhinav.mish...@cognizant.com
Subject: [MarkLogic Dev General] XDMP-FORESTNOT-- error occurred while re-indexing the database after upgrading marklogic from 5 to 6
To: general@developer.marklogic.com
Message-ID: 2790464791e54b44a00fbe4b0e85218206d4e...@ctsinchnsxmbv.cts.com

Hi All,

We have upgraded MarkLogic server from 5 to 6.0.4. After the upgrade we were re-indexing the database, and after some time we got the following error while accessing the web services (REST APIs):

    XDMP-FORESTNOT: Forest pce not available: XDMP-FORESTERR: Error in checkpoint of forest pce: SVC-FILWRT: File write error: open '/local/Marklogic/Forests/pce/00013708/Label': No such file or directory [1.0-ml]

When we checked the status of the forest, it was:

    There is currently an XDMP-FORESTERR: Error in reindex of forest pce: XDMP-REFRAGMENT: Error refragmenting xdmp:document-properties(/): XDMP-FORESTNOT: Forest pce-schemas not available: XDMP-FORESTERR: Error in merge of forest pce-schemas: XDMP-BAD: Bad ForestLabel::check, forest=pce-schemas, magic=16909060:16909060, version=50397184:83886849, pubLock=3360985868247232492:3360985868247232492, priLock=1051505488896049155:11574562360916870853 exception. Information on this page may be missing.

It seems that all the forests related to the database became unavailable during the re-indexing. Also, when we try to restart the MarkLogic server, stopping the server fails as well.

Please help.

Thanks,
Abhinav Kumar Mishra
Re: [MarkLogic Dev General] Search:parse Query
Actually, to extend the parser grammar you only have to recreate the grammar configuration and add any custom starter you want, so you could support both (not/-). I am using lower case in my example, but you can convert to upper case.

    <search:grammar>
      <search:quotation>'</search:quotation>
      <search:implicit>
        <cts:and-query strength="20" xmlns:cts="http://marklogic.com/cts"/>
      </search:implicit>
      <search:starter strength="30" apply="grouping" delimiter=")">(</search:starter>
      <search:starter strength="40" apply="prefix" element="cts:not-query">-</search:starter>
      <search:starter strength="40" apply="prefix" element="cts:not-query">not</search:starter>
      <search:joiner strength="10" apply="infix" element="cts:or-query" tokenize="word">or</search:joiner>
      <search:joiner strength="20" apply="infix" element="cts:and-query" tokenize="word">and</search:joiner>
      <search:joiner strength="50" apply="constraint">:</search:joiner>
      <search:joiner strength="50" apply="constraint" compare="EQ" tokenize="word">eq</search:joiner>
      <search:joiner strength="50" apply="constraint" compare="LT" tokenize="word">lt</search:joiner>
      <search:joiner strength="50" apply="constraint" compare="LE" tokenize="word">le</search:joiner>
      <search:joiner strength="50" apply="constraint" compare="GT" tokenize="word">gt</search:joiner>
      <search:joiner strength="50" apply="constraint" compare="GE" tokenize="word">ge</search:joiner>
      <search:joiner strength="50" apply="constraint" compare="NE" tokenize="word">ne</search:joiner>
    </search:grammar>
Re: [MarkLogic Dev General] querying with multiple fragment roots
Rob,

I believe the proximity in co-occurrence is the distance from one node to another, not from one word to another (as stated in the documentation). So if a group is always expressed as

    Group (A, B, C, D)

then the proximity of A-C = 2. You can confirm this by using cts:value-tuples against a single document, passing the proximity.

    cts:value-tuples((
        cts:element-reference(xs:QName("a")),
        cts:element-reference(xs:QName("c"))
      ),
      ("proximity=2", "ordered")
    )

Gary Vidal
Media Consultant
MarkLogic Corporation
gary.vi...@marklogic.com
Phone: +1 917 576-5794
Skype: ml-garyvidal
www.marklogic.com
[MarkLogic Dev General] unable to load twitter data json document
First you should confirm the JSON is valid. Use a tool like jsonlint.org to validate it. If it is truly valid, please submit the actual JSON so we can file a bug on your behalf.

Regards,
Gary Vidal
Media Consultant
MarkLogic Corporation
gary.vi...@marklogic.com
Phone: +1 917 576-5794
Skype: ml-garyvidal
www.marklogic.com
[MarkLogic Dev General] xdmp:filesystem-file-exists -- Any
Gurbeer,

You can use xdmp:exists(fn:doc("/yourfile")) or fn:doc-available("/your-file") if you want to confirm that a document exists in the database. It will resolve from the index if you have the URI lexicon turned on.

If you need to check the existence of multiple documents, you can simulate the check with something like this. cts:uris returns the URIs from the list that actually exist, so anything in the list that is not in that result is missing.

    let $existing := cts:uris((), (), cts:document-query($list-of-uris))
    return $list-of-uris[fn:not(. = $existing)]

Regards,
Gary Vidal
[MarkLogic Dev General] RE: General Digest, Vol 43, Issue 1
It would be interesting to have a function that dynamically adds to a score based on a predicate expression, such as

    cts:boost-score($cts:query, 16, 5000)

where

- the 1st param is the query to evaluate against the document score or quality;
- the 2nd param is the quality weight to add based on the expression;
- the 3rd param is the number of records to evaluate from the current returned results, so that lower-ranked items do not have to be evaluated past a certain result.

This is kind of like a two-phase query process: first, get results using the current search; second, boost any results that match the defined predicate expression. I don't know whether this is technically possible or theoretically sound, but it would be a powerful function.

Just as a note, I first believed that you could use nested OR expressions that would add weight whenever the OR expression evaluated to true:

    [base-query] OR
        OR EXPR1 = range(2007 - 2000) weight: 16
        OR EXPR2 = range(1999 - 1990) weight: 10
        OR EXPR3 = range(1989 - 1980) weight: 6
        OR EXPR4 = range(< 1980)      weight: -16

Unfortunately, the OR expression is not evaluated when computing the score.

Gary Vidal
Sr. .Net Developer
Tel: 212-592-4946

-----Original Message-----
Sent: Thursday, January 10, 2008 3:00 PM
To: general@developer.marklogic.com
Subject: General Digest, Vol 43, Issue 1

Message: 1
Date: Wed, 9 Jan 2008 15:23:11 -0500
From: Mattio Valentino
Subject: [MarkLogic Dev General] Influence relevancy on a term not used in the search phrase
To: General Mark Logic Developer Discussion general@developer.marklogic.com

I have a set of documents that are keyword searchable in MarkLogic. They each have the year of publication set, e.g. <year>1972</year>. I've been asked if I can influence the relevancy of the search results based on the year of publication even if it wasn't used in the search by the user. The desired effect is that documents published recently rank higher than documents published 30 years ago.

The only obvious way I can think to do this is by using document quality. Is there another approach?

Thanks,
Matt
[MarkLogic Dev General] RE: XML diff (Andrew_Redhead)
Anyone interested,

Seeing that there is nothing from ML about this, and it is something I have wanted for quite some time, perhaps we should band together and start a project to write a diff library module based on the DiffX (order-sensitive) algorithm or X-Diff (order-insensitive). I have started such a module but have not gotten far. I will share it, and hopefully somebody can make it work, or we can collaborate and share it back with the community.

Here are some resources I used to formulate my thoughts:

http://swag.uwaterloo.ca/~rekram/publications/cascon2005-diffx-algorithm-to-detect-changes-in-xml-documents.pdf
http://www.cs.wisc.edu/niagara/papers/xdiff.pdf

Gary Vidal
Sr. .Net Developer
Tel: 212-592-4946

Attachment: lib-xmldiff.xqy

-----Original Message-----
Sent: Wednesday, November 07, 2007 3:00 PM
To: general@developer.marklogic.com
Subject: General Digest, Vol 41, Issue 5

Message: 1
Date: Wed, 7 Nov 2007 17:10:01 -
From: Andrew_Redhead
Subject: RE: [MarkLogic Dev General] XML diff
To: General Mark Logic Developer Discussion general@developer.marklogic.com

It seems like this would be a popular feature. I haven't heard from anyone in ML yet, but another ML user has told me off list that the answer is no. Anyone from ML care to comment?

Cheers,
Andy

AR -----Original Message-----
AR From: John Craft
AR Sent: 06 November 2007 17:15
AR To: General Mark Logic Developer Discussion
AR Subject: RE: [MarkLogic Dev General] XML diff
AR
AR All-
AR
AR I am curious about this, too. Does MarkLogic have any differencing
AR functionality built in?
AR
AR John Craft
AR
AR From: Andrew_Redhead
AR Sent: Tuesday, October 16, 2007 9:43 AM
AR To: general@developer.marklogic.com
AR Subject: [MarkLogic Dev General] XML diff
AR
AR Hi,
AR
AR Just wondering if there are any functions in ML to perform an XML
AR diff between two documents?
AR
AR Thanks,
AR
AR Andy
[MarkLogic Dev General] Finding Empty nodes
I need to construct a query that returns all documents in which a given node is empty, i.e. <byline/>. How could this be achieved using cts:search?

Gary Vidal
Sr. .Net Developer
Tel: 212-592-4946
[MarkLogic Dev General] RE: General Digest, Vol 38, Issue 8
Wow,

The beauty is in its simplicity. Thank you,

Gary Vidal
Sr. .Net Developer
Tel: 212-592-4946

-----Original Message-----
Sent: Saturday, August 11, 2007 3:00 PM
To: general@developer.marklogic.com
Subject: General Digest, Vol 38, Issue 8

Today's Topics:
1. uuid implementation in Xquery (Gary Vidal)
2. Re: uuid implementation in Xquery (Michael Blakeley)

Message: 1
Date: Fri, 10 Aug 2007 15:39:43 -0400
From: Gary Vidal
Subject: [MarkLogic Dev General] uuid implementation in Xquery
To: general@developer.marklogic.com

I was wondering if anybody has implemented a uuid (universally unique identifier) function in XQuery. I would like to use this to generate ids, or a comparable id generation scheme.

Regards,
Gary Vidal
Sr. .Net Developer
Tel: 212-592-4946

Message: 2
Date: Fri, 10 Aug 2007 16:01:21 -0700
From: Michael Blakeley
Subject: Re: [MarkLogic Dev General] uuid implementation in Xquery
To: General Mark Logic Developer Discussion general@developer.marklogic.com

I happen to have one in front of me right now. Note that there are at least five flavors of UUID - I chose to implement type-4.

    (: this is a v4 UUID :)
    define function generate-uuid-v4() as xs:string
    {
      let $x := concat(
        xdmp:integer-to-hex(xdmp:random()),
        xdmp:integer-to-hex(xdmp:random())
      )
      return string-join((
        substring($x, 1, 8),
        substring($x, 9, 4),
        substring($x, 13, 4),
        substring($x, 17, 4),
        substring($x, 21, 14)
      ), '-')
    }

Ref: http://en.wikipedia.org/wiki/UUID

-- Mike
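For readers on the 1.0-ml dialect, the same idea might be written as below. This is a sketch, not Mike's code: the helper local:random-hex16 is a made-up name, added on the assumption that xdmp:integer-to-hex does not zero-pad its result, so each random value is padded to 16 hex digits before the string is sliced. Like the original, it does not set the UUID version or variant bits, so the result has the UUID shape but is not a strictly conforming v4 value.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: one xdmp:random() value as exactly 16 hex digits. :)
declare function local:random-hex16() as xs:string
{
  let $hex := xdmp:integer-to-hex(xdmp:random())
  (: left-pad with zeros, then keep only the last 16 characters :)
  let $padded := fn:concat("0000000000000000", $hex)
  return fn:substring($padded, fn:string-length($padded) - 15)
};

(: 32 random hex digits, split 8-4-4-4-12 and joined with hyphens. :)
declare function local:generate-uuid-v4() as xs:string
{
  let $x := fn:concat(local:random-hex16(), local:random-hex16())
  return fn:string-join((
    fn:substring($x, 1, 8),
    fn:substring($x, 9, 4),
    fn:substring($x, 13, 4),
    fn:substring($x, 17, 4),
    fn:substring($x, 21, 12)
  ), "-")
};

local:generate-uuid-v4()
```

The padding keeps every group at its full width even when a random value happens to have leading zeros.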
[MarkLogic Dev General] RE: General Digest, Vol 37, Issue 13
I think this would work:

    for $i in xdmp:directory("/resources/")
    return fn:document-uri($i)

Gary Vidal
American Lawyer Media
Sr. .Net Developer
Tel: 212-592-4946

-----Original Message-----
Sent: Friday, July 20, 2007 3:00 PM
To: general@developer.marklogic.com
Subject: General Digest, Vol 37, Issue 13

Today's Topics:
1. query (Kalidasu Surada)
2. Re: query (Steve Christensen)
3. Re: query (Michael Blakeley)

Message: 1
Date: Fri, 20 Jul 2007 13:15:21 +0530
From: Kalidasu Surada
Subject: [MarkLogic Dev General] query
To: General@developer.marklogic.com

Hi,

I am getting the list of documents residing in a database through this query:

    for $i in input() return document-uri($i)

but what I want is to get the documents from a particular folder. Could you please help?

Thanks and Regards,
Kalidasu Surada

Message: 2
Date: Fri, 20 Jul 2007 10:33:46 -0600
From: Steve Christensen
Subject: Re: [MarkLogic Dev General] query
To: General Mark Logic Developer Discussion general@developer.marklogic.com

If you are using ML 3.2, you can enable URI lexicons in your database and use the following function:

http://xqzone.marklogic.com/pubs/3.2/apidocs/SearchBuiltins.html#uri-match

-Steve

Message: 3
Date: Fri, 20 Jul 2007 11:08:54 -0700
From: Michael Blakeley
Subject: Re: [MarkLogic Dev General] query
To: General Mark Logic Developer Discussion general@developer.marklogic.com

See http://developer.marklogic.com/pubs/3.2/apidocs/Extension.html#directory

-- Mike
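The directory listing discussed in this thread might be sketched in current 1.0-ml syntax as below. The directory /resources/ is taken from the reply above; everything else is an assumption. The last form only works when the URI lexicon is enabled on the database.

```xquery
xquery version "1.0-ml";

(: Immediate children of the directory only; depth "1" is the default. :)
let $children :=
  for $doc in xdmp:directory("/resources/", "1")
  return fn:document-uri($doc)

(: Every document below the directory, at any depth. :)
let $descendants :=
  for $doc in xdmp:directory("/resources/", "infinity")
  return fn:document-uri($doc)

(: Same URIs read from the URI lexicon, without fetching the documents.
   Assumes the URI lexicon is enabled. :)
let $from-lexicon :=
  cts:uris((), (), cts:directory-query("/resources/", "infinity"))

return $from-lexicon
```

Reading from the lexicon avoids loading each document just to get its URI, which matters once a directory holds many documents.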