Re: [MASSMAIL]Re: High fieldNorm values causing really odd results

2015-05-18 Thread Jorge Luis Betancourt González
From what I'm seeing, I've defined a boost field in my docs. This field is 
defined as a float, with the following fieldType: 

 

Is a field named boost used by default to boost a document? I couldn't find 
any reference to this behaviour in the docs, only to using a boost attribute 
at the doc/field level.

Is this the intended behaviour? 

Regards,

Re: [MASSMAIL]Re: High fieldNorm values causing really odd results

2015-05-14 Thread Jorge Luis Betancourt González
Regarding the experiment, sorry if I explained myself badly: the indexed 
document doesn't have 119669 terms, it has far fewer (fewer than 1000 terms, 
I don't have the exact number here right now). Instead, 119669 is the number 
of distinct terms reported by Luke (Top-terms total in the admin interface) 
on the title field. 

This index was built from scratch using 4.10.3, if I'm remembering 
correctly. Perhaps part of the data was indexed using 4.10.2, but we updated 
our box quite some time ago and this problem didn't appear until recently. 
The strangest part is that this was working fine until a week or so ago. The 
only odd thing I found is that the root partition on our Solr box ran out of 
space. Basically, we have Solr deployed in Tomcat, which is installed on the 
root partition, but the cores and all Solr-related data are stored on a 
separate partition mounted at /opt with plenty of space to grow. Could this 
be the cause of this behavior? 

We're thinking of rebuilding our index, but would love to avoid it if 
possible and, more importantly, find the root cause of this issue (if that 
is possible at all).

As I said before, I'm very grateful for your responses.

Re: [MASSMAIL]Re: High fieldNorm values causing really odd results

2015-05-14 Thread Chris Hostetter

: Sorry for leaving the Solr version out in my previous email, I'm using 
: Solr 4.10.3 running on Centos7, with the following JRE: Oracle 
: Corporation OpenJDK 64-Bit Server VM (1.7.0_75 24.75-b04)

I can't reproduce this using Solr 4.10.3 (or 4.10.4; I misread your email 
the first time).

Are you certain you didn't *build* this index with a different Similarity 
configured? Or did you perhaps build it with an older version of Solr that 
might have had a bug in it?

Here's what I tried...

I applied this patch to the example configs, based on the fieldType you 
specified...

hossman@tray:~/lucene/lucene_solr_4_10_3_tag$ svn diff
Index: solr/example/solr/collection1/conf/schema.xml
===
--- solr/example/solr/collection1/conf/schema.xml   (revision 1679472)
+++ solr/example/solr/collection1/conf/schema.xml   (working copy)
@@ -46,6 +46,21 @@
 -->
 
 
+
+
+
+
+
+
+
+
+
+
+
+
+
+  
   

Re: [MASSMAIL]Re: High fieldNorm values causing really odd results

2015-05-14 Thread Jorge Luis Betancourt González
Hi Hoss,

First of all, thank you for your reply.

Sorry for leaving the Solr version out of my previous email. I'm using Solr 
4.10.3 running on CentOS 7, with the following JRE: Oracle Corporation OpenJDK 
64-Bit Server VM (1.7.0_75 24.75-b04)

These are the relevant portions of my schema.xml:















In this particular case I'm not using any special features, just a typical 
text field. I'm using the default similarity class provided by Solr, so this 
is a pretty straightforward setup :)

Regards,

- Original Message -
From: "Chris Hostetter" 
To: solr-user@lucene.apache.org
Sent: Thursday, May 14, 2015 4:08:36 PM
Subject: [MASSMAIL]Re: High fieldNorm values causing really odd results


:   {
:  "match":true,
:  "value":655360,
:  "description":"fieldNorm(doc=5316)"
:   }
...
: This match is in the "title" field, which has 119669 total terms (which 
: isn't such big number) and the total document count in this index is 

that smells like a bug -- by the looks of it an overflow bug?

can you please provide some details on the version of solr you are using, 
and the specifics of your schema: what field type, what similarity 
configuration you have (if any) etc...
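
As a rough illustration of why a fieldNorm of 655360 stands out, here is a 
minimal Python sketch of the norm calculation as I understand it for the 
default similarity in Lucene 4.x: boost / sqrt(numTerms), stored in a single 
byte with a 3-bit mantissa (SmallFloat's "315" format). The function names 
are made up for this sketch, and it ignores details such as 
discountOverlaps, so treat it as an approximation rather than the actual 
implementation.

```python
import math
import struct


def float_to_byte315(f):
    """Encode a float into one byte (3 mantissa bits, zero exponent at 15)."""
    # Reinterpret the float32 bit pattern as a signed 32-bit integer.
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    small = bits >> 21          # keep sign, exponent and top 3 mantissa bits
    floor = (63 - 15) << 3      # smallest representable exponent/mantissa
    if small <= floor:
        return 0 if bits <= 0 else 1    # zero/negative -> 0, underflow -> 1
    if small >= floor + 0x100:
        return 255                      # overflow saturates at the max byte
    return small - floor


def byte315_to_float(b):
    """Decode a norm byte back into the float it stands for."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def field_norm(num_terms, boost=1.0):
    """Norm as it would be read back after the lossy one-byte round trip."""
    raw = boost / math.sqrt(num_terms)
    return byte315_to_float(float_to_byte315(raw))


if __name__ == "__main__":
    # Unboosted fields of various lengths: the norm never exceeds 1.0.
    for n in (1, 4, 100, 1000):
        print(n, field_norm(n))
    # The value from the explain output survives an encode/decode round trip,
    # i.e. it is one of the 256 values a norm byte can represent.
    print(float_to_byte315(655360.0), byte315_to_float(float_to_byte315(655360.0)))
```

Under these assumptions, an unboosted field with at least one term cannot 
produce a norm above 1.0, so a decoded value of 655360 would need either a 
very large index-time boost or a norm byte that was not written the way 
this sketch assumes.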


-Hoss
http://www.lucidworks.com/