Hi Geert,

Thanks for the reply. In MarkLogic the query takes about 30 seconds; in Elasticsearch it takes 4 seconds. It is a cluster of 3 nodes. Each node has 16 GB of RAM, and "ls /proc/cpuinfo" shows 8 cores (I think that is because of hyper-threading and the actual core count is 4). I have configured 4 forests per node. Do you think increasing or decreasing the number of forests will help? Since this is a range index query, I assume the entire index is in memory, so the other cache settings should not affect this query.

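A minimal sketch of how the timing and the cache counters for such a query can be collected in a single request, using the cts:element-values query quoted further down in this thread. ELEMENT_NAME is the placeholder from that query, and the use of xdmp:eager to force evaluation before the meters are read is an assumption about how the measurement was taken, not something stated in the thread:

```xquery
xquery version "1.0-ml";

(: Sketch only: ELEMENT_NAME is the placeholder from the quoted question
   and must have an element range index configured. :)
let $words :=
  (: xdmp:eager forces evaluation here, so the meters and elapsed time
     read below are taken after the lexicon lookup has actually run. :)
  xdmp:eager(
    for $word in cts:element-values(
      xs:QName("ELEMENT_NAME"), (), ("frequency-order", "limit=10"))
    return <word count="{cts:frequency($word)}">{$word}</word>
  )
return (
  $words,
  xdmp:query-meters(),   (: cache hit/miss counters for this request :)
  xdmp:elapsed-time()    (: time spent in this request so far :)
)
```

Running it twice in a row gives a rough idea of how much of the time is a cold-cache effect, since the second run should report fewer cache misses.
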
If I run the query with query meters, I see only the cache activity below; all other cache hit/miss counters are 0.

<qm:value-cache-misses>194</qm:value-cache-misses>
<qm:regexp-cache-hits>181</qm:regexp-cache-hits>
<qm:regexp-cache-misses>5</qm:regexp-cache-misses>

Thanks & regards,
Ravinder Singh Maan

On Sat, Feb 20, 2016 at 7:33 PM, <[email protected]> wrote:

> Message: 1
> Date: Sat, 20 Feb 2016 18:44:50 +0000
> From: Geert Josten <[email protected]>
> Subject: Re: [MarkLogic Dev General] Best way to find most occuring
>     word or sort by frequency
> To: MarkLogic Developer Discussion <[email protected]>
>
> Hi,
>
> I think this is the right approach.
>
> If you talk about it being slow, how slow is that exactly? And how did you
> configure MarkLogic? More specifically, how many forests do you have? Also,
> how much memory and how many CPU cores do you have?
>
> Kind regards,
> Geert
>
> From: <[email protected]> on behalf of RAVINDER MAAN <[email protected]>
> Reply-To: MarkLogic Developer Discussion <[email protected]>
> Date: Saturday, February 20, 2016 at 11:34 AM
> To: "[email protected]" <[email protected]>
> Subject: [MarkLogic Dev General] Best way to find most occuring word or
>     sort by frequency
>
> Hello all,
>
> I want to sort element values by frequency. I have tried the following:
>
> for $word in cts:element-values(xs:QName("ELEMENT_NAME"), (),
>     ("frequency-order", "limit=10"))
> return <word count="{cts:frequency($word)}">{$word}</word>
>
> But for a very large index this is slow in comparison to Elasticsearch. I
> did this comparison on the same machine with the same data, and of course
> only one of them was running during the comparison. There are about 250
> million documents, and frequencies range from 1 million down to the
> hundreds; i.e., if I run the above query, the word at the top has a count
> of 1000000.
>
> Is there any other way of doing the same?
>
> Thanks & regards,
> Ravinder Singh Maan
>
> ------------------------------
>
> Message: 2
> Date: Sat, 20 Feb 2016 19:33:41 +0000
> From: Geert Josten <[email protected]>
> Subject: Re: [MarkLogic Dev General] [1.0-ml] XDMP-TRPLIDXNOTFOUND:
>     cts:triples() -- Triple index not enabled
> To: MarkLogic Developer Discussion <[email protected]>
>
> Hi Gaël,
>
> You need to enable the triple-index.
> You can do that by going to the Admin UI of your MarkLogic installation,
> navigating to the relevant content database, and toggling the triple index
> from false to true there. It should be around the 10th edit option, so
> close to the top. Confirm the change by clicking OK at the top or bottom of
> the page, and then wait for the reindex to complete. You can follow the
> progress on the Status tab of that database. Refresh it once in a while to
> get it updated.
>
> Kind regards,
> Geert
>
> From: <[email protected]> on behalf of Gaël YIMEN YIMGA <[email protected]>
> Reply-To: MarkLogic Developer Discussion <[email protected]>
> Date: Saturday, February 20, 2016 at 5:46 PM
> To: MarkLogic Developer Discussion <[email protected]>
> Subject: [MarkLogic Dev General] [1.0-ml] XDMP-TRPLIDXNOTFOUND:
>     cts:triples() -- Triple index not enabled
>
> Hello All,
>
> I'm facing an issue in MarkLogic.
> I successfully ran the following query:
> ===================
> import module namespace sem = "http://marklogic.com/semantics"
>   at "/MarkLogic/semantics.xqy";
>
> sem:rdf-insert(
>   (
>     sem:triple(
>       sem:iri("http://example.org/marklogic/people/John_Smith"),
>       sem:iri("http://example.org/marklogic/predicate/livesIn"),
>       "London"
>     )
>     ,
>     sem:triple(
>       sem:iri("http://example.org/marklogic/people/Jane_Smith"),
>       sem:iri("http://example.org/marklogic/predicate/livesIn"),
>       "London"
>     )
>     ,
>     sem:triple(
>       sem:iri("http://example.org/marklogic/people/Jack_Smith"),
>       sem:iri("http://example.org/marklogic/predicate/livesIn"),
>       "Glasgow"
>     )
>   )
> )
> ===================
>
> But as a second step, I ran the following to count the number of triples:
> =======
> xquery version "1.0-ml";
> declare namespace html = "http://www.w3.org/1999/xhtml";
> fn:count(cts:triples());
> =======
> I got the error shown in the image below.
>
> [Inline image 1]
>
> I would be grateful for your help in fixing this.
>
> Thanks in advance!
>
> Gaël.
>
> ------------------------------
>
> End of General Digest, Vol 140, Issue 54
_______________________________________________ General mailing list [email protected] Manage your subscription at: http://developer.marklogic.com/mailman/listinfo/general
