[MarkLogic Dev General] full stack trace
Hello, Is there a way to turn off stack trace trimming? By default it puts an ellipsis after a certain number of characters, which is a pain for debugging. http://developer.marklogic.com/mailman/listinfo/general
Re: [MarkLogic Dev General] MLCP heap error
Gary,

Try setting -Xmx in addition to -Xms. -Xms only sets the initial heap size; -Xmx sets the maximum size the heap is allowed to grow to.

Best,
Oleksii

From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Gary Larsen
Sent: Thursday, November 16, 2017 4:55 PM
To: 'General MarkLogic Developer Discussion'
Subject: [MarkLogic Dev General] MLCP heap error

Hi,

I'm trying to use MLCP on Windows to export to an archive in local mode but can't get past a heap error. It's probably something obvious I'm missing, and I would appreciate the help.

Thanks,
Gary

This is the script I created:

    SET JAVA_OPTS= -Xms5000m
    mlcp.bat export -host localhost -port 8104 -username Admin -password envisn -mode local -output_file_path C:\a-backup\MLCP\c1108 -copy_collections true -output_type archive

Here's the output:

    C:\a-work\mlcp\mlcp-8.0.6.3\bin>rem export to archive
    C:\a-work\mlcp\mlcp-8.0.6.3\bin>SET JAVA_OPTS= -Xms5000m
    C:\a-work\mlcp\mlcp-8.0.6.3\bin>mlcp.bat export -host localhost -port 8104 -username Admin -password envisn -mode local -output_file_path C:\a-backup\MLCP\c1108 -copy_collections true -output_type archive
    17/11/16 16:43:17 INFO mapreduce.MarkLogicInputFormat: Fetched 1 forest splits.
    17/11/16 16:43:17 INFO mapreduce.MarkLogicInputFormat: Made 30 splits.
    17/11/16 16:43:18 INFO contentpump.LocalJobRunner: completed 0%
    17/11/16 16:43:27 INFO contentpump.LocalJobRunner: completed 1%
    17/11/16 16:43:45 ERROR contentpump.LocalJobRunner: Error running task:
    java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Unknown Source)
        at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
        at java.lang.AbstractStringBuilder.append(Unknown Source)
        at java.lang.StringBuffer.append(Unknown Source)
        at com.marklogic.io.IOHelper.literalStringFromReader(IOHelper.java:50)
        at com.marklogic.io.IOHelper.literalStringFromStream(IOHelper.java:66)
        at com.marklogic.xcc.types.impl.AbstractStreamableItem.asString(AbstractStreamableItem.java:120)
        at com.marklogic.xcc.impl.ResultItemImpl.asString(ResultItemImpl.java:10)
        at com.marklogic.mapreduce.DatabaseDocument.set(DatabaseDocument.java:16)
        at com.marklogic.contentpump.DatabaseContentReader.nextKeyValue(DatabaseContentReader.java:442)
        at com.marklogic.contentpump.LocalJobRunner$TrackingRecordReader.nextKeyValue(LocalJobRunner.java:444)
        at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at com.marklogic.contentpump.LocalJobRunner$LocalMapTask.call(LocalJobRunner.java:378)
        at java.util.concurrent.FutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)
    17/11/16 16:44:16 INFO contentpump.LocalJobRunner: com.marklogic.mapreduce.MarkLogicCounter:
    17/11/16 16:44:16 INFO contentpump.LocalJobRunner: INPUT_RECORDS: 7005
    17/11/16 16:44:16 INFO contentpump.LocalJobRunner: OUTPUT_RECORDS: 7005
    17/11/16 16:44:16 INFO contentpump.LocalJobRunner: Total execution time: 58 sec
    C:\a-work\mlcp\mlcp-8.0.6.3\bin>

___
General mailing list
General@developer.marklogic.com
Manage your subscription at: http://developer.marklogic.com/mailman/listinfo/general
Re: [MarkLogic Dev General] xray tests
Geert,

There is nothing unusual about these tests; they call different submodules of our application. I can run xray when I cut the number of tests roughly in half. It doesn't matter which tests I include; what matters is how many. My colleague, who has a slightly more powerful machine with more RAM, can execute all these tests without any issues.

Regards,
Oleksii

From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert Josten
Sent: Thursday, August 31, 2017 11:32 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] xray tests

Hi,

Could you share some more detail on what is happening inside those tests? Would you be able to isolate which test is the culprit by commenting them out one by one?

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Oleksii Segeda <oseg...@worldbankgroup.org>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Thursday, August 31, 2017 at 5:19 PM
To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Subject: [MarkLogic Dev General] xray tests

Hi everyone,

I have around 20-30 xray unit tests (https://github.com/robwhitby/xray). I want to run the full set of tests locally before I deploy my code somewhere else. Unfortunately, ML dies with an out-of-memory error. If I run each test individually it works perfectly fine, but it takes forever to go through all of them manually. I've tried increasing swap, limiting the number of debug threads, limiting cache sizes, etc.; nothing helps. What else can be done here?

Regards,
Oleksii Segeda
IT Analyst
Information and Technology Solutions
[MarkLogic Dev General] xray tests
Hi everyone,

I have around 20-30 xray unit tests (https://github.com/robwhitby/xray). I want to run the full set of tests locally before I deploy my code somewhere else. Unfortunately, ML dies with an out-of-memory error. If I run each test individually it works perfectly fine, but it takes forever to go through all of them manually. I've tried increasing swap, limiting the number of debug threads, limiting cache sizes, etc.; nothing helps. What else can be done here?

Regards,
Oleksii Segeda
IT Analyst
Information and Technology Solutions
Re: [MarkLogic Dev General] Count of cts:element-values() not equal to number of element instances--what's going on?
Eliot,

You can do something like this:

    cts:element-value-co-occurrences(xs:QName("prof:overall-elapsed"), xs:QName("xdmp:document"))

if you have only one element per document.

Best,
Oleksii Segeda
IT Analyst
Information and Technology Solutions
www.worldbank.org

-----Original Message-----
From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Eliot Kimber
Sent: Monday, August 14, 2017 2:31 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] Count of cts:element-values() not equal to number of element instances--what's going on?

I have this query:

    let $durations := cts:element-values(xs:QName("prof:overall-elapsed"), (), "descending", cts:collection-query($collection))

And this query:

    let $overall-elapsed := $profiles/prof:metadata/prof:overall-elapsed

where there is an element range index for prof:overall-elapsed.

Comparing the two results, I get very different numbers when I expected them to be equal:

    47539
    21219

Doing this:

    count(distinct-values($overall-elapsed ! xs:dayTimeDuration(.)))

returns 21219, making it clear that the range index is returning distinct values, not all values.

It makes sense in terms of how I would expect a range index to be structured (a one-to-many mapping from values to elements) but doesn’t make sense as the return for a function named “element-values” (and not element-distinct-values). I didn’t see this behavior mentioned in the docs (although the introduction to the Lexicon reference section does describe lexicons as sets of unique values).

My requirement is to *quickly* get a list of the durations for all prof:expression elements (which I use for both counting and bucketing, so I need all values, not just all distinct values). Is there a way to do what I want using only indexes?

Thanks,
E.

--
Eliot Kimber
http://contrext.com
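Another index-only route to the instance count asked about in this thread, offered as a sketch rather than as something proposed in the original replies, is to ask the lexicon for item frequencies and sum them with cts:frequency. The collection name below is hypothetical, and the namespace binding assumes the `prof` prefix refers to MarkLogic's profiler namespace. It is worth checking the total against a known count before relying on it.

```xquery
xquery version "1.0-ml";
declare namespace prof = "http://marklogic.com/xdmp/profile";

(: Hypothetical collection name; substitute the one from your own data. :)
let $collection := "profiles"

(: "item-frequency" asks for frequencies based on item count
   rather than on the number of fragments containing the value. :)
let $values := cts:element-values(
  xs:QName("prof:overall-elapsed"), (),
  ("descending", "item-frequency"),
  cts:collection-query($collection))

return (
  fn:count($values),                  (: number of distinct values :)
  fn:sum($values ! cts:frequency(.))  (: sum of per-value frequencies :)
)
```

For bucketing, each distinct value can be paired with its cts:frequency instead of expanding the list back into one entry per instance.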
[MarkLogic Dev General] fitness score
Hi everyone,

I calculate the relevance of a single document to some query by doing this:

    cts:fitness(
      cts:search(
        fn:doc(),
        cts:and-query((
          cts:document-query(... document uri ...),
          ... some query ...
        )),
        ("score-logtf", "unfiltered")
      )
    )

I want to do the same thing for an element, without ingesting this element into my database. Any ideas?

Oleksii Segeda
IT Analyst
Information and Technology Solutions
Re: [MarkLogic Dev General] special character using fn:replace and regex
You can use a regex:

    let $input := "$10,000 value $6,000 , Not able to find other"
    return fn:replace($input, "(\$[0-9]+),([0-9]+)", "$1$2")

    => $10000 value $6000 , Not able to find other

Oleksii Segeda
IT Analyst
Information and Technology Solutions

From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of vikas.sin...@cognizant.com
Sent: Thursday, August 3, 2017 9:33 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] special character using fn:replace and regex

Hi All,

How can I replace a special character using fn:replace? I want to remove the special character "," from both $ values, but the other "," should remain.

Example:

Input: " $10,000 value $6,000 , Not able to find other"

After using the functx library I am able to get "$10" and "$6"; now I want to replace the corresponding "$10,000" and "$6,000" values. I am using fn:replace($input,"$10,",$10) but am unable to replace "$10," with "$10".

Output: $10000 value $6000 , Not able to find other
Re: [MarkLogic Dev General] indexes
Hi,

Does this mean that index metadata is stored along with documents? Was there a good reason to do so? Wouldn't it be better to mark this data for deletion and delete it during database merges? I'm asking because reindexing is a pain when you have terabytes of data.

Thanks,

From: general-boun...@developer.marklogic.com on behalf of John Snelson
Sent: Tuesday, July 11, 2017 5:58:35 PM
To: general@developer.marklogic.com
Subject: Re: [MarkLogic Dev General] indexes

To reclaim the space used by the range index.

On 11/07/2017 14:11, Oleksii Segeda wrote:

Hi,

Question to ML engineers here. Can someone explain why deletion of a range index causes reindexing?

Thanks.

Oleksii Segeda
IT Analyst
Information and Technology Solutions
T +12024736798
E oseg...@worldbankgroup.org
W www.worldbank.org

--
John Snelson, Principal Engineer
http://twitter.com/jpcs
MarkLogic Corporation
http://www.marklogic.com
[MarkLogic Dev General] indexes
Hi,

Question to ML engineers here. Can someone explain why deletion of a range index causes reindexing?

Thanks.

Oleksii Segeda
IT Analyst
Information and Technology Solutions
T +12024736798
E oseg...@worldbankgroup.org
W www.worldbank.org
Re: [MarkLogic Dev General] cts:element-value-match for integers
Christopher,

It gives false positives if I use it with cts:element-values.

Shan,

The rule is to find all values which start with the given value. For example, 200 should match 200, 2001, 2002, ... 20010, 20020, 2002123, etc. Are you suggesting guessing all possible combinations? If so, that's not possible. As I said, I need something like this (pseudocode):

    cts:element-value-match(xs:QName("element"), "200*")

except that I don't have a string range index on that field; I have an int range index instead.

Best,
Oleksii.

-----Original Message-----
From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Shan Jiang
Sent: Monday, June 19, 2017 2:06 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] cts:element-value-match for integers

What is your exact search rule? From your example, it looks like you are trying to find another number by appending a "0". If that is the case, can you run a cts:or-query, one for 200 and one for 2000?

Shan Jiang
Principal Consultant
MarkLogic Corporation
shan.ji...@marklogic.com
Phone: +1 703 869 4672
www.marklogic.com

On 6/19/17, 12:59 PM, "general-boun...@developer.marklogic.com on behalf of Oleksii Segeda" wrote:

>Hi everyone,
>
>Any thoughts on this?
>
>Oleksii.
>
>-----Original Message-----
>From: Oleksii Segeda
>Sent: Friday, June 16, 2017 6:16 PM
>To: general@developer.marklogic.com
>Subject: cts:element-value-match for integers
>
>Hi everyone,
>
>Can someone explain how cts:element-value-match works with integer
>indexes? I cannot pass a string as the second argument, so it's unclear how
>to do a wildcarded search.
>The ultimate goal is to find 2000 and 200 if the user typed 200. I understand
>that I can create an additional string index, but I want to know if a
>better solution exists.
>
>Thanks.
>
>Oleksii Segeda
>IT Analyst
>Information and Technology Solutions
>www.worldbank.org
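The "starts with 200" rule described in this thread does not require guessing individual values: every integer with a given decimal prefix falls into one contiguous range per digit count (200..200, 2000..2009, 20000..20099, and so on). The following is only a sketch of that idea, not an answer given on the list. The helper name `local:prefix-query` and the 9-digit cap are assumptions chosen so that the bounds stay inside xs:int, and the final string filter is there to drop the false positives that a fragment holding several values could otherwise contribute.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: an or-query of int range pairs covering every
   positive integer that starts with $prefix and has at most $max-digits digits.
   Assumes an int range index exists on $name. :)
declare function local:prefix-query(
  $name as xs:QName,
  $prefix as xs:integer,
  $max-digits as xs:integer
) as cts:query
{
  let $len := fn:string-length(fn:string($prefix))
  return cts:or-query(
    for $k in 0 to ($max-digits - $len)
    let $scale := xs:integer(math:pow(10, $k))
    let $low := $prefix * $scale
    let $high := ($prefix + 1) * $scale - 1
    return cts:and-query((
      cts:element-range-query($name, ">=", xs:int($low)),
      cts:element-range-query($name, "<=", xs:int($high))
    ))
  )
};

let $name := xs:QName("element")
let $prefix := 200
(: Narrow the lexicon with the range queries, then keep only values
   whose string form really starts with the prefix. :)
return cts:element-values($name, (), (), local:prefix-query($name, $prefix, 9))
         [fn:starts-with(fn:string(.), fn:string($prefix))]
```

The same query can be passed to cts:search when the goal is to find documents rather than to list matching values.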
Re: [MarkLogic Dev General] cts:element-value-match for integers
Hi everyone,

Any thoughts on this?

Oleksii.

-----Original Message-----
From: Oleksii Segeda
Sent: Friday, June 16, 2017 6:16 PM
To: general@developer.marklogic.com
Subject: cts:element-value-match for integers

Hi everyone,

Can someone explain how cts:element-value-match works with integer indexes? I cannot pass a string as the second argument, so it's unclear how to do a wildcarded search. The ultimate goal is to find 2000 and 200 if the user typed 200. I understand that I can create an additional string index, but I want to know if a better solution exists.

Thanks.

Oleksii Segeda
IT Analyst
Information and Technology Solutions
www.worldbank.org
[MarkLogic Dev General] cts:element-value-match for integers
Hi everyone,

Can someone explain how cts:element-value-match works with integer indexes? I cannot pass a string as the second argument, so it's unclear how to do a wildcarded search. The ultimate goal is to find 2000 and 200 if the user typed 200. I understand that I can create an additional string index, but I want to know if a better solution exists.

Thanks.

Oleksii Segeda
IT Analyst
Information and Technology Solutions
www.worldbank.org
[MarkLogic Dev General] Videos from ML World 2017
Hi everyone,

Are there any plans to update the YouTube channel or the website with videos from ML World 2017?

Regards,
Oleksii Segeda
IT Analyst
Information and Technology Solutions
www.worldbank.org
Re: [MarkLogic Dev General] Priorities for queries
Hi Gary,

I want to prioritize queries because the application can use up to 90% of maximum storage IOPS, and that is normal for us (we have continuous ingestion with a lot of writes and full document lookups). Unfortunately, SSD is not an option at this time.

Regards,
Oleksii

From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Gary Vidal
Sent: Thursday, May 25, 2017 6:44 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Priorities for queries

Oleksii,

Why would you want to prioritize queries the way you described? It would not make sense to deprioritize disk I/O unless you have some issues with disk performance. Consider disk I/O from stand merges to be a natural part of doing business in MarkLogic and in any system that does "log level compaction". If you are creating documents in bulk and running queries at the same time, there are a few techniques you could employ, such as using a "fast-data directory" attached to SSD, or figuring out why your disks are slow using the dd command. Again, without knowing your write/read patterns and the cardinalities and shape of your data, it's a very hard question to answer correctly. You may also want to look at pausing stand merges using blackouts for periods of high query load, but this should be done with extreme caution with respect to your query patterns.

Happy to discuss directly with you. Feel free to email for that discussion.

Regards,
Gary Vidal
Re: [MarkLogic Dev General] Priorities for queries
Gary,

Please correct me if I’m wrong, but this will only parallelize queries without addressing priorities. This means that if one of them creates a lot of disk I/O, the second one hangs.

Best,
Oleksii

From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Gary Vidal
Sent: Wednesday, May 24, 2017 6:58 AM
To: general@developer.marklogic.com
Subject: [MarkLogic Dev General] Priorities for queries

Oleksii,

Why don't you just create two app servers, one for query traffic and one for admin?

Regards,
Gary
Re: [MarkLogic Dev General] Priorities for queries
Hi Geert,

It makes sense. I guess the first query can return only a ticket number, which can then be used to access the results.

Best,
Oleksii

From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Geert Josten
Sent: Tuesday, May 23, 2017 3:25 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Priorities for queries

Hi Oleksii,

If you use xdmp:spawn or xdmp:spawn-function, you would be able to use the option. It takes 'normal' and 'higher' as values. These priorities have separate queues and worker threads, so they should interfere less with each other.

It might also be worth looking into a way to push low-priority work out to a dedicated host for longer-running tasks. You could do that by writing such queries to the database and having a scheduled task on that particular host watch for them, pick them up one by one, and write back the results once done. It might be easiest to switch script queries over to an asynchronous process that polls regularly to see whether results have been written.

Makes sense?

Cheers,
Geert

From: <general-boun...@developer.marklogic.com> on behalf of Oleksii Segeda <oseg...@worldbankgroup.org>
Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
Date: Monday, May 22, 2017 at 8:59 PM
To: "general@developer.marklogic.com" <general@developer.marklogic.com>
Subject: [MarkLogic Dev General] Priorities for queries

Hi,

Is there a way to give a lower priority to certain queries? We have two different types of API consumers: real users and various scripts. No matter how often scripts hit the endpoints or how "heavy" their queries are, they should not affect API performance for real users. In other words, scripts are tolerant of high latency, but users are not.

Regards,
Oleksii Segeda
IT Analyst
Information and Technology Solutions
W www.worldbank.org
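A minimal sketch of the spawn-with-priority suggestion follows. The option name `priority` is taken from the documented xdmp:eval/xdmp:spawn options and is not visible in the archived message, so treat it as an assumption about what Geert meant. The query inside the function is a made-up placeholder for whatever a script consumer would run.

```xquery
xquery version "1.0-ml";

(: Queue script-originated work on the task server at "normal" priority,
   leaving the "higher" queue free for work that must not wait.
   The estimate below is a placeholder workload. :)
xdmp:spawn-function(
  function() {
    xdmp:estimate(cts:search(fn:doc(), cts:word-query("trade")))
  },
  <options xmlns="xdmp:eval">
    <priority>normal</priority>
  </options>
)
```

Because a spawned task runs asynchronously, the caller would typically have the task write its result to the database under a known URI and poll for it, which matches the ticket-number idea at the top of this message.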
[MarkLogic Dev General] Priorities for queries
Hi,

Is there a way to give a lower priority to certain queries? We have two different types of API consumers: real users and various scripts. No matter how often scripts hit the endpoints or how "heavy" their queries are, they should not affect API performance for real users. In other words, scripts are tolerant of high latency, but users are not.

Regards,
Oleksii Segeda
IT Analyst
Information and Technology Solutions
W www.worldbank.org
[MarkLogic Dev General] xdmp:parse-dateTime
Hi everyone,

The docs say that xdmp:parse-dateTime will not return the correct dateTime value for dates before October 15, 1582. What should I use for dates before October 15, 1582?

Regards,
Oleksii Segeda
IT Analyst
Information and Technology Solutions
Re: [MarkLogic Dev General] Regular Expressions
Hi Erik,

Unfortunately, my whole codebase is in XQuery, but I need to use word boundaries and non-matching groups. So far the only idea that comes to mind is to use xdmp:javascript-eval. I'm curious whether this approach is considered normal practice.

Best,
Oleksii Segeda
IT Analyst
Information and Technology Solutions

From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Erik Hennum
Sent: Wednesday, March 22, 2017 4:58 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Regular Expressions

Hi, Oleksii:

Regarding question 2, aside from a few edge cases, the MarkLogic libraries have the same core implementation with JavaScript and XQuery interfaces. The core behavior of functions in the MarkLogic libraries is (in almost every case) consistent across environments.

If you are working in JavaScript and the regex implementation from v8 is a good fit for your requirements, you should take advantage of JavaScript regex objects and methods.

Hoping that clarifies,
Erik Hennum

From: general-boun...@developer.marklogic.com [general-boun...@developer.marklogic.com] on behalf of Sewell, David R. (drs2n) [dsew...@virginia.edu]
Sent: Wednesday, March 22, 2017 1:40 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Regular Expressions

I'm not sure what the answer is to question 2, but for question 1, the answer is that MarkLogic's implementation of XPath doesn't support the \b character escape because it is not included in the XPath specification for regular expressions, which itself is based on "XML Schema Part 2: Datatypes Second Edition". The only single-character escapes are these:

https://www.w3.org/TR/xmlschema-2/#nt-charClassEsc

Some XSLT and XQuery processors support extended regular expressions as a proprietary feature (for example, Saxon has a semi-documented extension that allows full Java regex), but MarkLogic doesn't (unless there is undocumented support that I don't know about).

David

On Mar 22, 2017, at 3:55 PM, Oleksii Segeda <oseg...@worldbankgroup.org> wrote:

Hi everyone,

Quick questions regarding regex in ML:

1. What's the ML alternative to the word boundary \b? It seems that fn:analyze-string doesn't support this special character.
2. Does the JS version of this function (fn.analyzeString) use the JS regex engine? If so, why does it give me an error for fn.analyzeString("foo bar bar", "\\b(bar)\\b")?

Regards,
Oleksii Segeda
IT Analyst
Information and Technology Solutions
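Since XPath regular expressions have no \b, one way to approximate whole-word matching while staying in XQuery is to let MarkLogic tokenize the text and compare word tokens. This is a sketch, not something proposed in the thread. It relies on cts:tokenize, which splits text into cts:word, cts:space, and cts:punctuation tokens using the language's tokenization rules, so its notion of a word can differ from a regex \b in edge cases.

```xquery
xquery version "1.0-ml";

let $text := "foo bar bar"
(: Keep only word tokens, then keep those equal to the target word. :)
let $hits := cts:tokenize($text)[. instance of cts:word][. eq "bar"]
return fn:count($hits)
```

This covers counting or detecting whole words; it does not replace a regex when capture groups or replacements are needed.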
[MarkLogic Dev General] Regular Expressions
Hi everyone,

Quick questions regarding regex in ML:

1. What's the ML alternative to the word boundary \b? It seems that fn:analyze-string doesn't support this special character.
2. Does the JS version of this function (fn.analyzeString) use the JS regex engine? If so, why does it give me an error for fn.analyzeString("foo bar bar", "\\b(bar)\\b")?

Regards,
Oleksii Segeda
IT Analyst
Information and Technology Solutions
[MarkLogic Dev General] transform
Hi all,

When I try to use transforms with DocumentManager (Java API), MarkLogic gives me this error:

    XDMP-MULTIPART-DONE: xdmp:document-load("rest::", /original/4ea94612..) -- All parts are already processed

Please advise.

Regards,
Oleksii Segeda
IT Analyst
Information and Technology Solutions
Re: [MarkLogic Dev General] Custom search grammar
Hi Erik,

Did you figure out how to extend the grammar?

Regards,
Oleksii Segeda
IT Analyst
Information and Technology Solutions

-----Original Message-----
From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Oleksii Segeda
Sent: Monday, January 30, 2017 3:09 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Custom search grammar

Hi Erik,

Yes, that is the desired behavior. Ideally, I would like to avoid custom constraints, simply because the search grammar looks cleaner in the search box. In addition, some of our users are already familiar with simple search operators like AND and OR, so BOOST won't look alien to them. I guess postprocessing can be used as you suggested; however, I'm interested in a custom search grammar because I may need to extend it further in the future.

Thank you,
Oleksii Segeda
IT Analyst
Information and Technology Solutions

-----Original Message-----
From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Erik Hennum
Sent: Monday, January 30, 2017 2:42 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Custom search grammar

Hi, Oleksii:

Thanks for providing more detail.

Just to confirm, is it clear that, in a boost query, the right-hand term is optional? Documents with only the left-hand term will still appear in the results, though with less relevance than documents that have both terms. By contrast, AND-related terms are both required and both contribute to relevance.

Anyway, to increase weight, one approach would be to define a tag for a quoted phrase and pass the phrase to a Search API custom constraint or to cts:parse() with a binding to a query generator function:

http://docs.marklogic.com/guide/search-dev/cts_query#id_13456

The custom code could then tokenize the phrase and combine the terms with a boost-query or and-query, adding appropriate weight.

Another approach would be to postprocess the query tree returned by cts:parse() or search:parse() to replace the default boost-query or and-query with a query that has more weight.

In either approach, you would then search on the query. I mention cts:parse() because it parses query text more quickly than search:parse().

Hoping that helps,
Erik Hennum

From: general-boun...@developer.marklogic.com [general-boun...@developer.marklogic.com] on behalf of Oleksii Segeda [oseg...@worldbankgroup.org]
Sent: Monday, January 30, 2017 10:55 AM
To: general@developer.marklogic.com
Subject: Re: [MarkLogic Dev General] Custom search grammar

Hi Erik,

I'm trying to boost some parts of the search query. For example, if the user types `trade BOOST water`, I want documents with the word "water" to be higher in the results. cts:boost-query seems to be a perfect fit, but the default BOOST doesn't let you specify weights. My ultimate goal is to convert `trade BOOST water` to something like this:

    cts:boost-query(
      cts:word-query("trade"),
      cts:word-query("water", (), 10.0)
    )

Regards,
Oleksii Segeda
IT Analyst
Information and Technology Solutions

-----Original Message-----
From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of general-requ...@developer.marklogic.com
Sent: Monday, January 30, 2017 1:08 PM
To: general@developer.marklogic.com
Subject: General Digest, Vol 151, Issue 42

Message: 1
Date: Mon, 30 Jan 2017 16:51:26 +
From: Oleksii Segeda
Subject: [MarkLogic Dev General] Custom search grammar
To: "general@developer.marklogic.com"
Content-Type: text/plain; charset="us-ascii"

Hi there,

I'm trying to declare a custom search grammar. I declared a custom function via search options, which is supposed to parse the "BOOST" keyword:

    http://worldbankgroup.org/search/grammar"; at="/lib/grammar-boost.xqy" tokenize="word">BOOST

I declared this function and just copied the existing implementation from the impl:joiner-boost function in /MarkLogic/appservices/search/search-impl.xqy :

    declare function gramm
Re: [MarkLogic Dev General] Custom search grammar
Hi Erik,

Yes, that is the desired behavior. Ideally, I would like to avoid custom constraints, simply because the search grammar looks cleaner in the search box. In addition, some of our users are already familiar with simple search operators like AND and OR, so BOOST won't look alien to them. I guess postprocessing can be used as you suggested; however, I'm interested in a custom search grammar because I may need to extend it further in the future.

Thank you,
Oleksii Segeda
IT Analyst
Information and Technology Solutions

-----Original Message-----
From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of Erik Hennum
Sent: Monday, January 30, 2017 2:42 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Custom search grammar

Hi, Oleksii:

Thanks for providing more detail. Just to confirm, is it clear that, in a boost query, the right-hand term is optional? Documents with only the left-hand term will still appear in the results, though with less relevance than documents that have both terms. By contrast, AND-related terms are both required and both contribute to relevance.

Anyway, to increase weight, one approach would be to define a tag for a quoted phrase and pass the phrase to a Search API custom constraint or to cts:parse() with a binding to a query generator function:

http://docs.marklogic.com/guide/search-dev/cts_query#id_13456

The custom code could then tokenize the phrase and combine the terms with a boost-query or and-query, adding appropriate weight.

Another approach would be to do postprocessing of the query tree returned by cts:parse() or search:parse() to replace the default boost-query or and-query with a query that has more weight.

In either approach, you would then search on the query. I mention cts:parse() because it parses query text more quickly than search:parse().

Hoping that helps,
Erik Hennum
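Erik's second suggestion, postprocessing the query returned by cts:parse(), might look roughly like the sketch below. It is not code from the thread: local:weight-term and local:reweight-boost are hypothetical helper names, and the sketch assumes a MarkLogic version that provides cts:parse() and whose default grammar turns `trade BOOST water` into a cts:boost-query of two word queries. Only a top-level boost query with a word query on the right-hand side is rewritten; any other query shape is returned unchanged.

```xquery
xquery version "1.0-ml";

(: Sketch only. local:weight-term and local:reweight-boost are hypothetical
   helpers introduced for illustration; they are not MarkLogic functions. :)

(: Rebuild a word query with a new weight; leave other query types as they are. :)
declare function local:weight-term($q as cts:query, $weight as xs:double) as cts:query {
  typeswitch ($q)
    case cts:word-query return
      cts:word-query(cts:word-query-text($q), cts:word-query-options($q), $weight)
    default return $q
};

(: Replace the boosting side of a top-level boost query with a weighted copy. :)
declare function local:reweight-boost($q as cts:query, $weight as xs:double) as cts:query {
  typeswitch ($q)
    case cts:boost-query return
      cts:boost-query(
        cts:boost-query-matching-query($q),
        local:weight-term(cts:boost-query-boosting-query($q), $weight))
    default return $q
};

(: Assumption: the default cts:parse() grammar recognizes BOOST. :)
let $query := local:reweight-boost(cts:parse("trade BOOST water"), 10.0)
return cts:search(fn:doc(), $query)[1 to 10]
```

The weight 10.0 is taken from the target query in the thread. Handling boost queries nested inside and-queries or or-queries would need a recursive walk over the sub-queries, which this sketch leaves out.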
Re: [MarkLogic Dev General] Custom search grammar
Hi Erik,

I'm trying to boost some parts of a search query. For example, if the user types `trade BOOST water`, I want documents with the word "water" to be higher in the results. cts:boost-query seems to be a perfect fit, but the default BOOST doesn't let you specify weights. My ultimate goal is to convert `trade BOOST water` to something like this:

    cts:boost-query(
      cts:word-query("trade"),
      cts:word-query("water", (), 10.0)
    )

Regards,
Oleksii Segeda
IT Analyst
Information and Technology Solutions

-----Original Message-----
From: general-boun...@developer.marklogic.com [mailto:general-boun...@developer.marklogic.com] On Behalf Of general-requ...@developer.marklogic.com
Sent: Monday, January 30, 2017 1:08 PM
To: general@developer.marklogic.com
Subject: General Digest, Vol 151, Issue 42

Message: 2
Date: Mon, 30 Jan 2017 18:07:41 +
From: Erik Hennum
Subject: Re: [MarkLogic Dev General] Custom search grammar
To: MarkLogic Developer Discussion

Hi, Oleksii:

Can you explain what you are trying to accomplish? There may be better ways of doing the same thing than creating a custom grammar, which is really a tool of last resort. For instance, a custom cons
[MarkLogic Dev General] Custom search grammar
Hi there,

I'm trying to declare a custom search grammar. I declared a custom function via search options, which is supposed to parse the "BOOST" keyword:

    http://worldbankgroup.org/search/grammar"; at="/lib/grammar-boost.xqy" tokenize="word">BOOST

I declared this function and just copied the existing implementation from the impl:joiner-boost function in /MarkLogic/appservices/search/search-impl.xqy:

    declare function grammar:custom-boost($ps as map:map, $left as element()?, $opts as element()?) as schema-element(cts:query) {
      let $symbol := impl:symbol-lookup($ps)
      let $_ := tdop:advance($ps)
      let $expr1 := tdop:expression($ps, $symbol/@strength)
      return
        if (empty($left))
        then ($left, impl:msg($ps, ))
        else
          element { xs:QName($symbol/@element) } {
            attribute qtextjoin {concat($symbol/string())},
            attribute strength {$symbol/@strength},
            attribute qtextgroup { impl:opts($ps)/opt:grammar/opt:starter[@apply eq "grouping"]/(string(), @delimiter/string()) },
            for $opt in $symbol/@options/tokenize(normalize-space(.), "\s")
            return {$opt},
            element cts:matching-query { attribute qtextref { "schema-element(cts:query)" }, $left },
            element cts:boosting-query { attribute qtextref { "schema-element(cts:query)" }, $expr1 }
          }
    };

Unfortunately this doesn't work, because for some reason impl:symbol-lookup returns an empty sequence. Any ideas what went wrong here?

Oleksii Segeda
IT Analyst
Information and Technology Solutions