On Tue, May 23, 2017 at 9:57 AM, Hanumantharayappa, Shanthamurthy < shantha...@qssinc.com> wrote:
> Unsubscribe me.
>
> -----Original Message-----
> From: general-boun...@developer.marklogic.com [mailto:general-bounces@developer.marklogic.com] On Behalf Of general-request@developer.marklogic.com
> Sent: Tuesday, May 23, 2017 9:56 AM
> To: general@developer.marklogic.com
> Subject: General Digest, Vol 155, Issue 27
>
> Send General mailing list submissions to
>         general@developer.marklogic.com
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         http://developer.marklogic.com/mailman/listinfo/general
> or, via email, send a message with subject or body 'help' to
>         general-requ...@developer.marklogic.com
>
> You can reach the person managing the list at
>         general-ow...@developer.marklogic.com
>
> When replying, please edit your Subject line so it is more specific than
> "Re: Contents of General digest..."
>
>
> Today's Topics:
>
>    1. Re: General Digest, Vol 155, Issue 24 (Shiv Shankar)
>    2. Re: Processing Large Number of Docs to Get Statistics
>       (Erik Hennum)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Tue, 23 May 2017 09:55:35 -0400
> From: Shiv Shankar <shiv.shivshan...@gmail.com>
> Subject: Re: [MarkLogic Dev General] General Digest, Vol 155, Issue 24
> To: MarkLogic Developer Discussion <general@developer.marklogic.com>
> Message-ID:
>         <cafyr2h6prmjzd9plh85kp7xpkuqgn+kjmk8haygeaq9oavf...@mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Justin,
> Thanks for introducing jsonPropertyReference. I see that some of the dob
> values are empty ("") and that is raising an exception.
> I see a bit of a challenge here. I was able to aggregate numbers/counts by
> using cts.values, but I am having difficulty aggregating by date and
> grouping the results.
>
> The scenario is
> content: {"dob":"1977-06-20", "dob":"",
> "dob":"1980-06-20","dob":"1977-06-20"}
>
> Expected result is {"0": 20, "1": 2, "4": 0};
>
> I am using the workaround below for age, but we want to achieve
> something similar using dob, ignoring any missing dobs. Any help?
>
> var ageQuery = cts.andQuery([
>   cts.elementRangeQuery(xs.QName('age'), ">=", 0),
>   cts.elementRangeQuery(xs.QName('age'), "<=", 100)
> ]);
> var result = {};
> // Constrain the lexicon lookup to the 0-100 age range.
> for (var agegroup of cts.values(cts.elementReference(xs.QName('age')),
>     null, null, ageQuery)) {
>   // Estimate how many documents carry this exact age value.
>   var query = cts.andQuery([
>     ageQuery,
>     cts.jsonPropertyValueQuery('age', agegroup)
>   ]);
>   result[agegroup] = cts.estimate(query);
> }
> result;
>
>
> Thanks
> Shan.
>
>
> On Tue, May 23, 2017 at 3:24 AM, <general-requ...@developer.marklogic.com>
> wrote:
>
> > Today's Topics:
> >
> >    1. Search by age wise from dob property (Shiv Shankar)
> >    2. Re: Search by age wise from dob property (Justin Makeig)
> >    3. Processing Large Number of Docs to Get Statistics (Eliot Kimber)
> >    4. Re: Priorities for queries (Geert Josten)
> >
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Mon, 22 May 2017 16:08:11 -0400
> > From: Shiv Shankar <shiv.shivshan...@gmail.com>
> > Subject: [MarkLogic Dev General] Search by age wise from dob property
> > To: MarkLogic Developer Discussion <general@developer.marklogic.com>
> > Message-ID:
> >         <CAFyr2H5Y5JR7kVfg4NFa4i-4xH1wvNskJg4reihCyPO6oecXHQ@mail.gmail.com>
> > Content-Type: text/plain; charset="utf-8"
> >
> > Hi,
> > There is a dob JSON property in the documents, and I need to search on
> > dob by age, i.e. age > 30, or age > 30 and age < 50. Any samples
> > to calculate age and compare in the search queries?
> >
> > Thanks
> > Shan.
> >
> > ------------------------------
> >
> > Message: 2
> > Date: Mon, 22 May 2017 20:33:06 +0000
> > From: Justin Makeig <justin.mak...@marklogic.com>
> > Subject: Re: [MarkLogic Dev General] Search by age wise from dob
> >         property
> > To: MarkLogic Developer Discussion <general@developer.marklogic.com>
> > Message-ID: <600acf55-4384-4294-bbf9-74700a4a8...@marklogic.com>
> > Content-Type: text/plain; charset="us-ascii"
> >
> > const person = { dob: xs.date("1979-02-03") };
> > const thrirtyYearsAgo = fn.currentDate().subtract(xs.yearMonthDuration("P30Y"));
> > person.dob < thrirtyYearsAgo; // true
> >
> > You can do date math with xs.duration types. In the above case, I'm
> > subtracting 30 years from the current date.
> > xs.date.prototype.subtract() returns an xs.date. You can compare that
> > xs.date to any other xs.date. To do this comparison in MarkLogic's
> > indexes you'll need to create a range index
> > <https://docs.marklogic.com/guide/concepts/indexing#id_51573>.
> > A range index, as its name implies, queries efficiently for ranges of typed
> > values, for example, dates more than thirty years before today.
> >
> > cts.rangeQuery(cts.jsonPropertyReference('dob'), '<', thrirtyYearsAgo);
> > // requires a range index of type xs:date on the dob JSON property
> >
> >
> > Justin
> >
> >
> > > On May 22, 2017, at 1:08 PM, Shiv Shankar <shiv.shivshan...@gmail.com> wrote:
> > >
> > > Hi,
> > > There is a dob json property in the documents and I need to search on
> > > dob based on age wise i.e age > 30, age >30 and age <50. Any samples
> > > to calculate age and compare in the search queries?
> > >
> > > Thanks
> > > Shan.
> > >
> > > _______________________________________________
> > > General mailing list
> > > General@developer.marklogic.com
> > > Manage your subscription at:
> > > http://developer.marklogic.com/mailman/listinfo/general
> >
> > ------------------------------
> >
> > Message: 3
> > Date: Mon, 22 May 2017 22:43:26 -0500
> > From: Eliot Kimber <ekim...@contrext.com>
> > Subject: [MarkLogic Dev General] Processing Large Number of Docs to
> >         Get Statistics
> > To: MarkLogic Developer Discussion <general@developer.marklogic.com>
> > Message-ID: <bdf9d2b1-c160-455d-b836-bc11c1db7...@contrext.com>
> > Content-Type: text/plain; charset="UTF-8"
> >
> > I haven't yet seen anything in the docs that directly addresses what I'm
> > trying to do and suspect I'm simply missing some ML basics or just
> > going about things the wrong way.
> >
> > I have a corpus of several hundred thousand docs (but could be
> > millions, of course), where each doc is an average of 200K and several
> > thousand elements.
> >
> > I want to analyze the corpus to get details about the number of
> > specific subelements within each document, e.g.:
> >
> > for $article in cts:search(/Article, cts:directory-query("/Default/",
> >     "infinity"))[$start to $end]
> > return <article-counts id="{$article/@id}"
> >     paras="{count($article//p)}"/>
> >
> > I'm running this as a query from Oxygen (so I can capture the results
> > locally so I can do other stuff with them).
> >
> > On the server I'm using I blow the expanded tree cache if I try to
> > request more than about 20,000 docs.
> >
> > Is there a way to do this kind of processing over an arbitrarily large
> > set *and* get the results back from a single query request?
> >
> > I think the only solution is to write the results back to the
> > database and then fetch that as the last thing but I was hoping there
> > was something simpler.
> >
> > Have I missed an obvious solution?
> >
> > Thanks,
> >
> > Eliot
> >
> > --
> > Eliot Kimber
> > http://contrext.com
> >
> > ------------------------------
> >
> > Message: 4
> > Date: Tue, 23 May 2017 07:24:31 +0000
> > From: Geert Josten <geert.jos...@marklogic.com>
> > Subject: Re: [MarkLogic Dev General] Priorities for queries
> > To: MarkLogic Developer Discussion <general@developer.marklogic.com>
> > Message-ID: <d549af3b.117ddb%geert.jos...@marklogic.com>
> > Content-Type: text/plain; charset="windows-1252"
> >
> > Hi Oleksii,
> >
> > If you use xdmp:spawn or xdmp:spawn-function, you would be able to use
> > the <priority> option. It takes "normal" and "higher" as values. These
> > priorities have separate queues and worker threads, so they should
> > interfere less with each other.
> >
> > It might also be worth looking into a way to push out low priority
> > work to a dedicated host for longer running tasks. You could do that
> > by writing such queries to the database and having a scheduled task on
> > that particular host watch for them, pick them up one by one, and
> > write back results once done. It might be easiest to switch
> > around script queries to an asynchronous process that polls regularly
> > to see if results have been written. Makes sense?
> >
> > Cheers,
> > Geert
> >
> > From: <general-boun...@developer.marklogic.com> on behalf of
> > Oleksii Segeda <oseg...@worldbankgroup.org>
> > Reply-To: MarkLogic Developer Discussion <general@developer.marklogic.com>
> > Date: Monday, May 22, 2017 at 8:59 PM
> > To: "general@developer.marklogic.com" <general@developer.marklogic.com>
> > Subject: [MarkLogic Dev General] Priorities for queries
> >
> > Hi,
> >
> > Is there a way to give a lower priority to certain queries? We have
> > two different types of API consumers: real users and various scripts.
> > No matter how often scripts hit endpoints or how "heavy" their
> > queries are, they should not affect API performance for real users.
> > In other words, scripts are tolerant of high latency, but users are not.
> >
> > Regards,
> >
> > Oleksii Segeda
> > IT Analyst
> > Information and Technology Solutions
> > www.worldbank.org <http://www.worldbank.org/>
> >
> > ------------------------------
> >
> > End of General Digest, Vol 155, Issue 24
> > ****************************************
>
> ------------------------------
>
> Message: 2
> Date: Tue, 23 May 2017 13:56:00 +0000
> From: Erik Hennum <erik.hen...@marklogic.com>
> Subject: Re: [MarkLogic Dev General] Processing Large Number of Docs
>         to Get Statistics
> To: MarkLogic Developer Discussion <general@developer.marklogic.com>
> Message-ID:
>         <dfdf2fd50bf5aa42adaf93ff2e3ca1850c7f4...@exchg10-be02.marklogic.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi, Eliot:
>
> On reflection, let me retract the range index suggestion. I wasn't
> considering the domain implied by the element names -- it would never make
> sense to blow out a range index with the value of all of the paragraphs.
>
> The TDE suggestion for MarkLogic 9 would still work, however, because you
> could have an xs:short column with a value of 1 for every paragraph.
>
>
> Erik Hennum
>
> ________________________________________
> From: general-boun...@developer.marklogic.com [general-bounces@developer.
> marklogic.com] on behalf of Erik Hennum [erik.hen...@marklogic.com]
> Sent: Tuesday, May 23, 2017 6:21 AM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Processing Large Number of Docs to
> Get Statistics
>
> Hi, Eliot:
>
> One alternative to Geert's good suggestion -- if and only if the number of
> element names is small and you can create range indexes on them:
>
> * add an element attribute range index on Article/@id
> * add an element range index on p
> * execute a cts:value-tuples() call with the constraining element query
>   and directory query
> * iterate over the tuples, incrementing the value of the id in a map
> * remove the range index on p
>
> In MarkLogic 9, that approach gets simpler. You can just use TDE to
> project rows with columns for the id and element, group on the id column,
> and count the rows in the group.
>
> Hoping that's useful (and salutations in passing),
>
>
> Erik Hennum
>
> ________________________________________
> From: general-boun...@developer.marklogic.com [general-bounces@developer.marklogic.com] on behalf of Geert Josten [geert.jos...@marklogic.com]
> Sent: Tuesday, May 23, 2017 12:53 AM
> To: MarkLogic Developer Discussion
> Subject: Re: [MarkLogic Dev General] Processing Large Number of Docs to
> Get Statistics
>
> Hi Eliot,
>
> I'd consider using taskbot
> (http://registry.demo.marklogic.com/package/taskbot), and using that in
> combination with either $tb:OPTIONS-SYNC or $tb:OPTIONS-SYNC-UPDATE. It
> will make optimal use of the TaskServer of the host on which you initiate
> the call. It doesn't scale endlessly, but it batches up the work
> automatically for you, and will get you a lot further fairly easily.
>
> Cheers,
> Geert
>
> On 5/23/17, 5:43 AM, "general-boun...@developer.marklogic.com on behalf
> of Eliot Kimber" <general-boun...@developer.marklogic.com on behalf of
> ekim...@contrext.com> wrote:
>
> >I haven't yet seen anything in the docs that directly addresses what I'm
> >trying to do and suspect I'm simply missing some ML basics or just
> >going about things the wrong way.
> >
> >I have a corpus of several hundred thousand docs (but could be
> >millions, of course), where each doc is an average of 200K and several
> >thousand elements.
> >
> >I want to analyze the corpus to get details about the number of
> >specific subelements within each document, e.g.:
> >
> >for $article in cts:search(/Article, cts:directory-query("/Default/",
> >    "infinity"))[$start to $end]
> >return <article-counts id="{$article/@id}"
> >    paras="{count($article//p)}"/>
> >
> >I'm running this as a query from Oxygen (so I can capture the results
> >locally so I can do other stuff with them).
> >
> >On the server I'm using I blow the expanded tree cache if I try to
> >request more than about 20,000 docs.
> >
> >Is there a way to do this kind of processing over an arbitrarily large
> >set *and* get the results back from a single query request?
> >
> >I think the only solution is to write the results back to the
> >database and then fetch that as the last thing but I was hoping there
> >was something simpler.
> >
> >Have I missed an obvious solution?
> >
> >Thanks,
> >
> >Eliot
> >
> >--
> >Eliot Kimber
> >http://contrext.com
>
> ------------------------------
>
> End of General Digest, Vol 155, Issue 27
> ****************************************
> This electronic mail (including any attachments) may contain information
> that is privileged, confidential, and/or otherwise protected from
> disclosure to anyone other than its intended recipient(s). Any
> dissemination or use of this electronic email or its contents (including
> any attachments) by persons other than the intended recipient(s) is
> strictly prohibited. If you have received this message in error, please
> notify the sender by reply email and delete the original message (including
> any attachments) in its entirety.
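On the dob grouping question in Message 1 of this digest: the age arithmetic and the skipping of empty dob values can be sketched in plain JavaScript, outside MarkLogic. This is only a sketch under assumptions. `ageOn` and `countByAge` are hypothetical helper names, the dob values are assumed to be `YYYY-MM-DD` strings, and the reference date is passed in explicitly so the result does not depend on the day the code runs. It does not touch range indexes, so it shows the logic rather than an indexed solution.

```javascript
// Hypothetical helper: whole years between an ISO date string and a reference date.
function ageOn(dob, today) {
  const [by, bm, bd] = dob.split("-").map(Number);
  const [ty, tm, td] = today.split("-").map(Number);
  let age = ty - by;
  // Subtract one year if the birthday has not yet occurred in the reference year.
  if (tm < bm || (tm === bm && td < bd)) age -= 1;
  return age;
}

// Hypothetical helper: count people per age, ignoring missing or malformed dob values.
function countByAge(dobs, today) {
  const counts = {};
  for (const dob of dobs) {
    // Skips "", undefined, null, and anything that is not YYYY-MM-DD.
    if (!/^\d{4}-\d{2}-\d{2}$/.test(dob || "")) continue;
    const age = ageOn(dob, today);
    counts[age] = (counts[age] || 0) + 1;
  }
  return counts;
}

// The dob values from the scenario in the message, with one empty entry.
const sample = ["1977-06-20", "", "1980-06-20", "1977-06-20"];
console.log(countByAge(sample, "2017-05-23"));
```

Filtering out the empty strings before any date parsing is the point here: the empty value never reaches the date arithmetic, so it cannot raise the kind of exception described in the message.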
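The tally step in Erik Hennum's range-index suggestion (iterate over the tuples, incrementing the value of the id in a map) might look roughly like this in plain JavaScript. The tuple shape is an assumption: each tuple is taken to be an `[articleId, paragraphValue]` pair, `tallyById` is a hypothetical helper name, and the `tuples` array is made-up sample data, not output from `cts:value-tuples()`.

```javascript
// Hypothetical helper: count how many tuples belong to each article id.
function tallyById(tuples) {
  const counts = new Map();
  for (const [id] of tuples) {
    counts.set(id, (counts.get(id) || 0) + 1);
  }
  return counts;
}

// Made-up sample data standing in for (id, p) co-occurrence tuples.
const tuples = [
  ["a1", "First paragraph"],
  ["a1", "Second paragraph"],
  ["a2", "Only paragraph"],
];

const paraCounts = tallyById(tuples);
for (const [id, paras] of paraCounts) {
  // Mirrors the shape of the article-counts element in the original query.
  console.log(`<article-counts id="${id}" paras="${paras}"/>`);
}
```

Only the ids and their running counts are held in memory, which is the appeal of the approach compared with loading each full article to count its paragraphs.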