Re: [MarkLogic Dev General] word-query including punctuation characters

2016-06-30 Thread Wissam Asfahani (TSO GB)
Using fields won't be an option for our use case, but arranging things to use 
value queries may be.

Is it possible to re-classify these characters as symbols or words, without 
using field tokenizer overrides? For example, by modifying the tokenizer.xml 
file?
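
For comparison, the field tokenizer override route discussed in this thread might be configured through the Admin API roughly as follows. This is only a sketch: the database name "Documents", the field name "units" and the element name "measure" are hypothetical, and the database has to finish reindexing before the override takes effect.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Hypothetical names: "Documents", "units" and "measure" are placeholders. :)
let $config := admin:get-configuration()
let $db := xdmp:database("Documents")

(: Create a field that does not include the document root. :)
let $config := admin:database-add-field($config, $db,
  admin:database-field("units", fn:false()))

(: Limit the field to the element where the character occurs. :)
let $config := admin:database-add-field-included-element($config, $db, "units",
  admin:database-included-element("", "measure", 1.0, "", "", ""))

(: Treat the micro sign as a word character inside this field only. :)
let $config := admin:database-add-field-tokenizer-override($config, $db, "units",
  admin:database-tokenizer-override("µ", "word"))

return admin:save-configuration($config)
```

Because the override is scoped to one field, it only affects field queries such as cts:field-word-query; plain cts:word-query calls keep the default tokenization.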

Wissam

___
General mailing list
General@developer.marklogic.com
Manage your subscription at:
http://developer.marklogic.com/mailman/listinfo/general




Re: [MarkLogic Dev General] word-query including punctuation characters

2016-06-29 Thread Mary Holstege
On Wed, 29 Jun 2016 08:06:35 -0700, Wissam Asfahani (TSO GB) wrote:

> Good afternoon,
>
> We are having some issues estimating the number of documents when  
> performing word queries containing punctuation characters.
>
> I have attached 4 sample documents. When using the below query, the  
> estimate returns 3 and the count 1.
>
> Are there any db configuration settings we can use to ensure a more  
> accurate estimate result?
>
>
> let $query := cts:word-query("4µ", ("exact"), 2)
>
> return
>   (
>     xdmp:estimate(cts:search(fn:doc(), $query)),
>     fn:count(cts:search(fn:doc(), $query))
>   )
>
>
> Wissam Asfahani
> XML Developer
>

Punctuation is not indexed in the word query indexes. An exact
unwildcarded *value* query will consider punctuation, so if you can
arrange things so that you can use a value query, that could be a
solution. If it is just this character and searching for it in this way is
confined to identifiable parts of the document, you could use field
tokenizer overrides to redefine µ as a word or symbol character for that
field. But it looks like it is being classified as a punctuation mark in
error: it should be classified as a letter character anyway since it is
listed as Ll in the Unicode tables.
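
As an illustration of the value-query route, the sketch below assumes the string "4µ" is the complete text of some element; the element name "measure" is hypothetical, since the thread does not say where the value occurs. The "exact" option is shorthand for case-sensitive, diacritic-sensitive, punctuation-sensitive, whitespace-sensitive, unstemmed and unwildcarded.

```xquery
xquery version "1.0-ml";

(: Hypothetical element name; replace "measure" with the real element. :)
let $query := cts:element-value-query(xs:QName("measure"), "4µ", "exact")

return
  (
    (: index-only answer :)
    xdmp:estimate(cts:search(fn:doc(), $query)),
    (: filtered answer, for comparison :)
    fn:count(cts:search(fn:doc(), $query))
  )
```

A value query only matches when the whole element content equals the string, so this helps only if the data can be arranged that way.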

//Mary


[MarkLogic Dev General] word-query including punctuation characters

2016-06-29 Thread Wissam Asfahani (TSO GB)
Good afternoon,

We are having some issues estimating the number of documents when performing 
word queries containing punctuation characters.

I have attached 4 sample documents. When using the below query, the estimate 
returns 3 and the count 1.

Are there any db configuration settings we can use to ensure a more accurate 
estimate result?


let $query := cts:word-query("4µ", ("exact"), 2)

return
  (
    xdmp:estimate(cts:search(fn:doc(), $query)),
    fn:count(cts:search(fn:doc(), $query))
  )
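
A note on why the two numbers can differ, offered as a sketch rather than a diagnosis: xdmp:estimate is answered from the indexes alone, while fn:count over cts:search runs a filtered search by default and re-checks every candidate document. Comparing an unfiltered count with a filtered one shows how many index candidates the filter rejects. The query is the one from this message; nothing else is assumed.

```xquery
xquery version "1.0-ml";

let $query := cts:word-query("4µ", "exact")

return
  (
    (: Index resolution only; no documents are opened. :)
    xdmp:estimate(cts:search(fn:doc(), $query)),

    (: Same index resolution, but the documents are returned and counted. :)
    fn:count(cts:search(fn:doc(), $query, "unfiltered")),

    (: Default behaviour: candidates are verified against the query. :)
    fn:count(cts:search(fn:doc(), $query, "filtered"))
  )
```

If the first two numbers agree and the third is lower, the gap comes from information the indexes do not hold, which matches the explanation about punctuation given in the reply.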


Wissam Asfahani
XML Developer



Re: [MarkLogic Dev General] Word Query - Excluded element Question

2015-08-27 Thread Yang, Yun
Thank you very much Mary for the explanations. Will keep your suggestion in 
mind.



Re: [MarkLogic Dev General] Word Query - Excluded element Question

2015-08-26 Thread David Ennis
I was hoping someone would have a better answer before I replied, but here
is my response. Hopefully others will clarify or build on it.


I do not think this will make a difference. As I understand it, excluded
and included elements are applied while the tree is traversed to create the
word indexes: parts of the tree are included or excluded during the
indexing step. Even if this statement is inaccurate in some way, it is
still related to the analysis of the trees (even the tree structure in ML
is a type of internal index like a term list, but with element A pointing
to parent B rather than a term pointing to a fragment).

So, with the includes and excludes all related to the word queries and the
way the tree was indexed, I don't see how any range indexes will help this.

Perhaps someone will debunk this understanding and/or suggest some magic
combination of other approaches. Perhaps there is another way: creating a
field with an XPath expression for the elements to include and using a
tuned field-word-query on that, or similar.







Kind Regards,
David Ennis



On 26 August 2015 at 04:45, Yang, Yun yun.y...@wolterskluwer.com wrote:

 Any advice? Appreciate the help.



Re: [MarkLogic Dev General] Word Query - Excluded element Question

2015-08-26 Thread Yang, Yun
Thanks David, really appreciate the detailed explanation. Will see if we can 
try some suggestions here.

Thanks again,

Yun



Re: [MarkLogic Dev General] Word Query - Excluded element Question

2015-08-26 Thread Mary Holstege
On Wed, 26 Aug 2015 00:49:17 -0700, David Ennis david.en...@hinttech.com  
wrote:

> I was hoping someone would have a better answer before I replied, but
> here is my response. Hopefully others will clarify or build on it.
>
> I do not think this will make a difference. As I understand it, excluded
> and included elements are applied while the tree is traversed to create
> the word indexes: parts of the tree are included or excluded during the
> indexing step. Even if this statement is inaccurate in some way, it is
> still related to the analysis of the trees (even the tree structure in ML
> is a type of internal index like a term list, but with element A pointing
> to parent B rather than a term pointing to a fragment).
>
> So, with the includes and excludes all related to the word queries and
> the way the tree was indexed, I don't see how any range indexes will help
> this.
>
> Perhaps someone will debunk this understanding and/or suggest some magic
> combination of other approaches. Perhaps there is another way: creating a
> field with an XPath expression for the elements to include and using a
> tuned field-word-query on that, or similar.


You are correct: the excludes are processed at index time while we are
walking the tree. At query time, we are just looking up word keys, and if
the element was excluded there will be no word keys for words in that
element in the index, so there will be no match. Adding a range index on
that attribute will only create more work at indexing time and will do
nothing at query time. There is no intrinsic issue with there being a lot
of documents with the exclusions: the overhead of applying them at
indexing is small, and in fact, if large chunks of documents are being
excluded, it could be a net performance enhancement (as well as saving
space in the index). At query time there is zero overhead; again, a
net savings because there are fewer matches to consider.
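
To make the index-time nature of the exclusion concrete, here is a sketch of how such an attribute-conditioned exclusion might be declared with the Admin API. The database name "Documents" is hypothetical, the element and attribute names come from the question, and the five-argument form of admin:database-excluded-element (element plus attribute namespace, name and value) is assumed to be available in the release in use.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Hypothetical database name. :)
let $config := admin:get-configuration()
let $db := xdmp:database("Documents")

(: Exclude <Content Indexing="false"> from the word query index. :)
let $config := admin:database-add-word-query-excluded-element($config, $db,
  admin:database-excluded-element("", "Content", "", "Indexing", "false"))

return admin:save-configuration($config)
```

After reindexing, a plain cts:word-query simply finds no word keys for text inside the excluded elements, so there is nothing left for a range index to speed up at query time.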

The danger with excludes on the word query field is that they disable
certain optimizations that rely on word positions, so you get false
positives on element queries of various kinds. If you aren't using
positions anyway, it won't matter. If you want to make sure we can still do
those optimizations (in the most recent releases of 7 and 8), you need to
also define the excluded elements as phrase-arounds. However, you can't
put an attribute/value condition on a phrase-around, so you can't do that
for your case.

One possibility is to use a named field with the exclusions, do
everything as field-word queries instead of word queries, and remove the
exclusions from the word query field. That will cost you time and space at
indexing time, however.
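
The named-field alternative could look roughly like the sketch below. The database name "Documents" and the field name "searchable" are hypothetical; the element and attribute names come from the question. As above, the attribute-conditioned form of admin:database-excluded-element is an assumption about the release in use.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Hypothetical names: "Documents" and "searchable". :)
let $config := admin:get-configuration()
let $db := xdmp:database("Documents")

(: A field that starts from the document root ... :)
let $config := admin:database-add-field($config, $db,
  admin:database-field("searchable", fn:true()))

(: ... minus the elements flagged Indexing="false". :)
let $config := admin:database-add-field-excluded-element($config, $db, "searchable",
  admin:database-excluded-element("", "Content", "", "Indexing", "false"))

return admin:save-configuration($config)
```

Once reindexing has finished, searches would go through the field instead of the word query index, for example:

```xquery
xquery version "1.0-ml";

(: "example" is a placeholder search term. :)
cts:search(fn:doc(), cts:field-word-query("searchable", "example"))
```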

//Mary



[MarkLogic Dev General] Word Query - Excluded element Question

2015-08-25 Thread Yang, Yun
All,

We use Word Query in our application. In the word query settings, in the 
excluded elements section, we have excluded one element based on an attribute 
value. For example, in element Content, if attribute Indexing = false, the 
element is excluded from the word search. The attribute has only two values, 
true and false.

Question:
Is it worth creating an attribute range index on two distinct values to 
speed up the search? Millions of our docs meet the condition for 
exclusion.

XML Sample:
<Content Indexing="false"/>


Thanks,

Yun Yang



[MarkLogic Dev General] Word Query Settings

2015-07-30 Thread Andreas Hubmer
Hi,

there is something which has surprised me so I'd like to share it. The
Word Query settings of a database do not only influence cts:word-query
searches, but also cts:element-word-query searches. The former are
mentioned on the help page, the latter not and so I thought there would be
a difference.

In my case element-word-queries worked well, but not for phrases consisting
of more than 2 words. I could fix the problem by adding the queried element
to the list of included elements.
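
A sketch of the fix described above, with hypothetical names: the database "Documents" and the element "title" are placeholders for the real database and the queried element.

```xquery
xquery version "1.0-ml";

import module namespace admin = "http://marklogic.com/xdmp/admin"
  at "/MarkLogic/admin.xqy";

(: Hypothetical names: "Documents" and "title". :)
let $config := admin:get-configuration()
let $db := xdmp:database("Documents")

(: Add the queried element to the word query included elements. :)
let $config := admin:database-add-word-query-included-element($config, $db,
  admin:database-included-element("", "title", 1.0, "", "", ""))

return admin:save-configuration($config)
```

After reindexing, a phrase of three or more words can be tested with an element word query:

```xquery
xquery version "1.0-ml";

(: Placeholder phrase and element name. :)
cts:search(fn:doc(),
  cts:element-word-query(xs:QName("title"), "quick brown fox"))
```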

Regards,
Andreas

-- 
Andreas Hubmer
IT Consultant


Re: [MarkLogic Dev General] word query

2014-10-17 Thread Girish Kulkarni
The word query problem occurred on MarkLogic version 7.0-2.1. It works fine
on 7.0-3.

Girish



Re: [MarkLogic Dev General] word query

2014-10-16 Thread Girish Kulkarni
I am using word-constraint-query in my structured query, and when I exclude
enrichedDateTime it does work, in the sense that I no longer see any results
for the timestamp search that matched earlier. Now my only problem is that I
don't see other fields being searched. To be more specific, the response
shows total = 20 but no results are actually returned. As soon as I add the
top-level element fix to the inclusion list, I start seeing the 20 results.

Girish

On Wed, Oct 15, 2014 at 4:09 PM, Danny Sokolsky 
danny.sokol...@marklogic.com wrote:

  Depending on what those structured queries are, they might not be word
 queries.  Excluding an element in the word query field does not mean you
 cannot query it, it just means that a cts:word-query will not see it.  For
 example, you can still see it in an element-word-query.
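
 The distinction can be checked with a pair of estimates. This is a sketch
 only: the element name comes from the sample document in this thread, the
 search term "2014" is just one token from its timestamp, and the expected
 behaviour is the one described above, not a verified result.

```xquery
xquery version "1.0-ml";

(
  (: Word query: should not see text inside an excluded element. :)
  xdmp:estimate(cts:search(fn:doc(), cts:word-query("2014"))),

  (: Element word query: scoped to the element itself. :)
  xdmp:estimate(cts:search(fn:doc(),
    cts:element-word-query(xs:QName("enrichedDateTime"), "2014")))
)
```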



 You say your reindexing started; did it complete?



 -Danny



 *From:* general-boun...@developer.marklogic.com [mailto:
 general-boun...@developer.marklogic.com] *On Behalf Of *Girish Kulkarni
 *Sent:* Wednesday, October 15, 2014 3:59 PM
 *To:* MarkLogic Developer Discussion
 *Subject:* Re: [MarkLogic Dev General] word query



 As soon as I make changes to the word query settings, my database starts
 re-indexing.

 We use MarkLogic 7.0-3 and REST-based structured queries. The query passes
 in a bunch of constraints and options; however, I am just testing
 before-and-after scenarios based on changes to the word query settings.



 Girish





Re: [MarkLogic Dev General] word query

2014-10-16 Thread Danny Sokolsky
Have you tried this on 7.0-4? Some bugs in this area were fixed there, so it 
is worth a try.

What I recommend you do is create a simple test case using a simple cts:search 
with a cts:query that shows the issue, then post that here with the exact 
config info for it.  Otherwise we are just guessing.  The details are very 
important to understanding what is going on.
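
A minimal test case along these lines might look like the sketch below. The document is the sample from this thread; the URI "/test/fix-1.xml" and the search term "goes" are arbitrary choices, and the database is assumed to already carry the word query settings under test.

```xquery
xquery version "1.0-ml";

(: Statement 1: load the sample document (hypothetical URI). :)
xdmp:document-insert("/test/fix-1.xml",
  <fix>
    <content>some content goes here</content>
    <enrichedDateTime>2014-09-30T16:32:27.424443-07:00</enrichedDateTime>
  </fix>)
;

xquery version "1.0-ml";

(: Statement 2: report the version, the index estimate and the actual hits. :)
let $query := cts:word-query("goes")
return
  (
    xdmp:version(),
    xdmp:estimate(cts:search(fn:doc(), $query)),
    cts:search(fn:doc(), $query)
  )
```

Posting the output together with the included and excluded element lists would show whether the estimate and the returned documents disagree.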

-Danny



[MarkLogic Dev General] word query

2014-10-15 Thread Girish Kulkarni
I had some fields in my XML document, like enrichedDateTime, which I didn't
want to index and search upon. When I added this element to the word query
exclusion list, for some reason my search stopped returning this document
at all, even when I searched on another field such as content. However,
when I added the root element name fix to my inclusion list, I do see the
document again. I had already set the include root flag to true, but it
seems that for some reason I am unable to search the other fields in the
document. Any ideas why this could be happening?

<fix>
  <content> some content goes here </content>
  <enrichedDateTime>2014-09-30T16:32:27.424443-07:00</enrichedDateTime>
</fix>



Girish Kulkarni
___
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] word query

2014-10-15 Thread Danny Sokolsky
Did you reindex your database after changing the word query field?

Exactly what query are you running?

What version of MarkLogic are you using (xdmp:version())?

-Danny

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Girish Kulkarni
Sent: Wednesday, October 15, 2014 3:48 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] word query

I had some fields in my XML document, like enrichedDateTime, which I didn't
want to index or search on. When I added this field to the word query
exclusion list, my search for some reason stopped returning this document
at all, even when I searched on another field such as content. However, when
I added the root field name fix to my inclusion list, I did see the
document again. I had already set the include root flag to true, but for
some reason I am unable to search on other fields in the document. Any
ideas why this could be happening?

<fix>
<content> some content goes here </content>
<enrichedDateTime>2014-09-30T16:32:27.424443-07:00</enrichedDateTime>
</fix>



Girish Kulkarni
___
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] word query

2014-10-15 Thread Girish Kulkarni
As soon as I make changes to the word query, my database starts
re-indexing.

We use MarkLogic 7.0-3 and we are using REST-based structured queries. The
query passes in a bunch of constraints and options; however, I am just
testing before-and-after scenarios based on changes to the word query.

Girish

On Wed, Oct 15, 2014 at 3:55 PM, Danny Sokolsky 
danny.sokol...@marklogic.com wrote:

 Did you reindex your database after changing the word query field?

 Exactly what query are you running?

 What version of MarkLogic are you using (xdmp:version())?

 -Danny

 *From:* general-boun...@developer.marklogic.com [mailto:
 general-boun...@developer.marklogic.com] *On Behalf Of *Girish Kulkarni
 *Sent:* Wednesday, October 15, 2014 3:48 PM
 *To:* MarkLogic Developer Discussion
 *Subject:* [MarkLogic Dev General] word query

 I had some fields in my XML document, like enrichedDateTime, which I didn't
 want to index or search on. When I added this field to the word query
 exclusion list, my search for some reason stopped returning this document
 at all, even when I searched on another field such as content. However, when
 I added the root field name fix to my inclusion list, I did see the
 document again. I had already set the include root flag to true, but for
 some reason I am unable to search on other fields in the document. Any
 ideas why this could be happening?

 <fix>
 <content> some content goes here </content>
 <enrichedDateTime>2014-09-30T16:32:27.424443-07:00</enrichedDateTime>
 </fix>

 Girish Kulkarni

-- 
Girish Kulkarni
___
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general


Re: [MarkLogic Dev General] word query

2014-10-15 Thread Danny Sokolsky
Depending on what those structured queries are, they might not be word queries.
Excluding an element in the word query field does not mean you cannot query
it; it just means that a cts:word-query will not see it. For example, you can
still see it with a cts:element-word-query.
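
A minimal sketch of that difference, assuming documents shaped like the fix sample in this thread, with enrichedDateTime in the word query exclusion list. The search term is a hypothetical placeholder, and no particular counts are implied:

```xquery
xquery version "1.0-ml";

(: Hypothetical term, assumed to occur only inside <enrichedDateTime>. :)
let $term := "2014"
return (
  (: Unscoped word query: resolved against the word query configuration,
     so text inside excluded elements is not expected to match. :)
  fn:count(cts:search(fn:doc(), cts:word-query($term))),

  (: Element-scoped word query: names the element explicitly, so the
     word query exclusion list does not hide it. :)
  fn:count(cts:search(fn:doc(),
    cts:element-word-query(xs:QName("enrichedDateTime"), $term)))
)
```

If the two counts differ after reindexing has finished, the exclusion is working as described, and a structured query that still fails is probably relying on an unscoped word query.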

You say your reindexing started; did it complete?

-Danny

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Girish Kulkarni
Sent: Wednesday, October 15, 2014 3:59 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] word query

As soon as I make changes to the word query, my database starts re-indexing.

We use MarkLogic 7.0-3 and we are using REST-based structured queries. The
query passes in a bunch of constraints and options; however, I am just testing
before-and-after scenarios based on changes to the word query.

Girish

On Wed, Oct 15, 2014 at 3:55 PM, Danny Sokolsky 
danny.sokol...@marklogic.com wrote:
Did you reindex your database after changing the word query field?

Exactly what query are you running?

What version of MarkLogic are you using (xdmp:version())?

-Danny

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Girish Kulkarni
Sent: Wednesday, October 15, 2014 3:48 PM
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] word query

I had some fields in my XML document, like enrichedDateTime, which I didn't
want to index or search on. When I added this field to the word query
exclusion list, my search for some reason stopped returning this document
at all, even when I searched on another field such as content. However, when
I added the root field name fix to my inclusion list, I did see the
document again. I had already set the include root flag to true, but for
some reason I am unable to search on other fields in the document. Any
ideas why this could be happening?

<fix>
<content> some content goes here </content>
<enrichedDateTime>2014-09-30T16:32:27.424443-07:00</enrichedDateTime>
</fix>



Girish Kulkarni

--
Girish Kulkarni
___
General mailing list
General@developer.marklogic.com
http://developer.marklogic.com/mailman/listinfo/general


[MarkLogic Dev General] Word query included element weight

2010-01-15 Thread Shannon
Hi,

If I want to boost the relevance of a title element in the search results in an 
app deployed with Application Builder, would increasing the weight value of 
word query includes within that database be the thing to do? I have tried 
ramping it up, from 3 to 1000 to 9, and it appears to have no effect.

Thanks,
Shannon
___
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Word query included element weight

2010-01-15 Thread Shannon
Danny,

Thank you for this.

The reindexer is enabled and the throttle is set to 5. It was reindexed, and I 
triggered it manually, just in case. Looked at the status summary to confirm 
that it happened. Happens quickly as the database is only 31MB.

I have not declared a default namespace, so I assume I leave that field blank?

inline: Screen shot 2010-01-15 at 4.59.51 PM.jpg

I changed the weight to 16 and just reindexed.

And _now_ I get what I expected!  

<hit score="1080">'Cautio Criminalis' Or A Book On Witch Trials</hit>
<hit score="1080">'Cautio Criminalis' Or A Book On Witch Trials</hit>
<hit score="990">Witchcraft and the Papacy</hit>
<hit score="945">I, Tituba, Black Witch of Salem</hit>
<hit score="855">Shaman of Oberstdorf</hit>
<hit score="855">Shaman of Oberstdorf</hit>
<hit score="855">Evil People</hit>
<hit score="765">With Paintbrush and Shovel</hit>
<hit score="765">Virginia Folk Legends</hit>

With a value of 9, just an FYI, this query resulted in this:

for $hit at $count in cts:search(
  collection(),
  cts:word-query("witch")
)
return element hit {
  attribute score { cts:score($hit) },
  string($hit//b203)
}

=>

<hit score="1620">Evil People</hit>
<hit score="1620">Witchcraft and the Papacy</hit>
<hit score="1530">With Paintbrush and Shovel</hit>
<hit score="1530">Virginia Folk Legends</hit>
<hit score="1530">'Cautio Criminalis' Or A Book On Witch Trials</hit>
<hit score="1485">Shaman of Oberstdorf</hit>
<hit score="1485">I, Tituba, Black Witch of Salem</hit>

Maybe there is a bug in the way the value is being rounded down?

Also, I referenced 13.1.3, Adding a Weight to Boost or Lower the Relevance of 
an Included Element in admin_guide and there is nothing documented about the 
weighting range.  

At any rate, thanks for your help!

Best,
Shannon

On Jan 15, 2010, at 4:40 PM, Danny Sokolsky wrote:

 Hi Shannon,
 
 This does appear to be the right approach.  Did you reindex your database 
 after doing this (or wait for reindexing to complete if reindexing is 
 enabled)?  The word query include weights are built into the indexes.
 
 Also, the weights should be between -16.0 and 16.0; a weight greater than 16 
 is rounded to 16, and a weight less than -16 is rounded to -16.
 
 -Danny 
 
 -Original Message-
 From: general-boun...@developer.marklogic.com 
 [mailto:general-boun...@developer.marklogic.com] On Behalf Of Shannon
 Sent: Friday, January 15, 2010 1:26 PM
 To: General Mark Logic Developer Discussion
 Subject: [MarkLogic Dev General] Word query included element weight
 
 Hi,
 
 If I want to boost the relevance of a title element in the search results in 
 an app deployed with Application Builder, would increasing the weight value 
 of word query includes within that database be the thing to do? I have tried 
 ramping it up, from 3 to 1000 to 9, and it appears to have no effect.
 
 Thanks,
 Shannon

-- 
Shannon Scott Shiflett, XML Programmer
ROTUNDA, The University of Virginia Press
PO Box 801079, Charlottesville, VA 22904-4318 USA
Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
Email: shifl...@virginia.edu   Tel: +1 434 924 4495
Web: http://rotunda.upress.virginia.edu/

___
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general


RE: [MarkLogic Dev General] Word query included element weight

2010-01-15 Thread Danny Sokolsky
It would need to be in the namespace of your title element.  In the word query 
included element interface,  make sure you specify the proper namespace and 
localname for the included element.

-Danny 

-Original Message-
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Shannon
Sent: Friday, January 15, 2010 2:10 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] Word query included element weight

Danny,

Thank you for this.

The reindexer is enabled and the throttle is set to 5. It was reindexed, and I 
triggered it manually, just in case. Looked at the status summary to confirm 
that it happened. Happens quickly as the database is only 31MB.

I have not declared a default namespace, so I assume I leave that field blank?

___
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general


Re: [MarkLogic Dev General] Word query included element weight

2010-01-15 Thread Shannon
Thanks, it's working now, see the rest of my e-mail (sorry it was so verbose). 
I made a suggestion that the valid range should be documented because it wasn't 
rounding down 500 properly.

Best,
Shannon

On Jan 15, 2010, at 5:27 PM, Danny Sokolsky wrote:

 It would need to be in the namespace of your title element.  In the word 
 query included element interface,  make sure you specify the proper namespace 
 and localname for the included element.
 
 -Danny 
 
 -Original Message-
 From: general-boun...@developer.marklogic.com 
 [mailto:general-boun...@developer.marklogic.com] On Behalf Of Shannon
 Sent: Friday, January 15, 2010 2:10 PM
 To: General Mark Logic Developer Discussion
 Subject: Re: [MarkLogic Dev General] Word query included element weight
 
 Danny,
 
 Thank you for this.
 
 The reindexer is enabled and the throttle is set to 5. It was reindexed, and 
 I triggered it manually, just in case. Looked at the status summary to 
 confirm that it happened. Happens quickly as the database is only 31MB.
 
 I have not declared a default namespace, so I assume I leave that field blank?
 

-- 
Shannon Scott Shiflett, XML Programmer
ROTUNDA, The University of Virginia Press
PO Box 801079, Charlottesville, VA 22904-4318 USA
Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
Email: shifl...@virginia.edu   Tel: +1 434 924 4495
Web: http://rotunda.upress.virginia.edu/

___
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general


RE: [MarkLogic Dev General] Word query included element weight

2010-01-15 Thread Danny Sokolsky
Hi Shannon,

Sorry, I got the number range wrong on the weight in my previous email.  I was 
incorrectly referring to the cts:query constructor weight arguments, not the 
field (and word-query) weight.

It does not round down to 16, and negative weights are not allowed.

To lower the score, make the value less than 1.0 and greater than 0.  To raise 
the score, make it greater than 1.0.  Greater numbers boost the score more.
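
For comparison, the other kind of weight mentioned here, the cts:query constructor weight, is supplied at query time rather than built into the indexes. A rough sketch, assuming b203 is the title element (as in the query earlier in this thread); the weight of 8 is an arbitrary illustrative value, not a recommendation:

```xquery
xquery version "1.0-ml";

(: Assumption: b203 holds the title, as in the earlier query in this thread. :)
let $term := "witch"
let $query :=
  cts:or-query((
    (: Matches the term anywhere the word query configuration allows. :)
    cts:word-query($term),
    (: The fourth argument is the query-time weight; 8 is arbitrary. :)
    cts:element-word-query(xs:QName("b203"), $term, (), 8)
  ))
for $hit in cts:search(fn:collection(), $query)
return element hit {
  attribute score { cts:score($hit) },
  fn:string(($hit//b203)[1])
}
```

Unlike the word query include weight, changing the weight in this query should not require a reindex, which makes it easier to experiment with before settling on a database setting.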

Sorry for the confusion.

-Danny

From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Shannon
Sent: Friday, January 15, 2010 2:29 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] Word query included element weight

Thanks, it's working now, see the rest of my e-mail (sorry it was so verbose). 
I made a suggestion that the valid range should be documented because it wasn't 
rounding down 500 properly.

Best,
Shannon

On Jan 15, 2010, at 5:27 PM, Danny Sokolsky wrote:


It would need to be in the namespace of your title element.  In the word query 
included element interface,  make sure you specify the proper namespace and 
localname for the included element.

-Danny

-Original Message-
From: general-boun...@developer.marklogic.com 
[mailto:general-boun...@developer.marklogic.com] On Behalf Of Shannon
Sent: Friday, January 15, 2010 2:10 PM
To: General Mark Logic Developer Discussion
Subject: Re: [MarkLogic Dev General] Word query included element weight

Danny,

Thank you for this.

The reindexer is enabled and the throttle is set to 5. It was reindexed, and I 
triggered it manually, just in case. Looked at the status summary to confirm 
that it happened. Happens quickly as the database is only 31MB.

I have not declared a default namespace, so I assume I leave that field blank?

--
Shannon Scott Shiflett, XML Programmer
ROTUNDA, The University of Virginia Press
PO Box 801079, Charlottesville, VA 22904-4318 USA
Courier: 310 Old Ivy Way, Suite 302, Charlottesville VA 22903
Email: shifl...@virginia.edu   Tel: +1 434 924 4495
Web: http://rotunda.upress.virginia.edu/

___
General mailing list
General@developer.marklogic.com
http://xqzone.com/mailman/listinfo/general