[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198360#comment-15198360
 ] 

ASF subversion and git services commented on SOLR-8740:
---

Commit e76fa568172173feeed3eaaf7de06b773b32605d in lucene-solr's branch 
refs/heads/master from [~yo...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=e76fa56 ]

SOLR-8740: use docValues for non-text fields in schema templates


> use docValues by default
> 
>
> Key: SOLR-8740
> URL: https://issues.apache.org/jira/browse/SOLR-8740
> Project: Solr
>  Issue Type: Improvement
>Affects Versions: master
>Reporter: Yonik Seeley
> Fix For: master
>
> Attachments: SOLR-8740.patch, SOLR-8740.patch
>
>
> We should consider switching to docValues for most of our non-text fields.  
> This may be a better default since it is more NRT friendly and acts to avoid 
> OOM errors due to large field cache or UnInvertedField entries.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198363#comment-15198363
 ] 

ASF subversion and git services commented on SOLR-8740:
---

Commit 97361313582d054ff44a2a3a8e206f313e67d68c in lucene-solr's branch 
refs/heads/branch_6_0 from [~yo...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=9736131 ]

SOLR-8740: use docValues for non-text fields in schema templates





[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-18 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198362#comment-15198362
 ] 

ASF subversion and git services commented on SOLR-8740:
---

Commit 14752476f445436944618a6f1dde9bd787a1f3c9 in lucene-solr's branch 
refs/heads/branch_6x from [~yo...@apache.org]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=1475247 ]

SOLR-8740: use docValues for non-text fields in schema templates





[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-09 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15187900#comment-15187900
 ] 

Erick Erickson commented on SOLR-8740:
--

I added some tests that should serve to flag if the ordering of MV fields is 
different so we can stop discussing it ;) See SOLR-8813




[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-08 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186237#comment-15186237
 ] 

Shawn Heisey commented on SOLR-8740:


On further consideration, if the schema version is explicitly stated, I guess 
that won't happen.




[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-08 Thread Shawn Heisey (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186232#comment-15186232
 ] 

Shawn Heisey commented on SOLR-8740:


bq. At the moment, both indexed and stored flags are false if not set, I 
believe.

The default values for stored and indexed are *true* for all schema versions.  
See this method in FieldType.java, line 153 in particular:

https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=blob;f=solr/core/src/java/org/apache/solr/schema/FieldType.java;h=fbcebcda1b163dd1ba2d9980e1cfd540b2769f13;hb=12f7ad66963a5ae784f2bd0bf8b5dbc4b3c1630e#l150

I think setting the default for docValues to true in schema version 1.7 is 
probably a good idea, but I do predict a lot of "I upgraded and now my index is 
twice as big!" messages on the list. :)





[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-08 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15186008#comment-15186008
 ] 

Mark Miller commented on SOLR-8740:
---

There are tradeoffs. They won't be shoved on users though. They can always 
switch back. Given the history of users with OOM and the fieldcache, I think we 
are making the default a much better user experience. 




[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-08 Thread Alexandre Rafalovitch (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185917#comment-15185917
 ] 

Alexandre Rafalovitch commented on SOLR-8740:
-

{quote}
We could consider changing the default such that even if it doesn't appear in 
the schema, docValues would be true (and bump the schema version of course). I 
had assumed that might be more controversial though, so was just looking to 
effectively change the de-facto defaults.
{quote}

At the moment, both *indexed* and *stored* flags are *false* if not set, I 
believe. Making *docValues* *true* if not set could cause confusion. 
So I am with Yonik's original suggestion of having them enabled as explicit 
flags in the example schemas.




[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-08 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185581#comment-15185581
 ] 

Tomás Fernández Löbbe commented on SOLR-8740:
-

Also, if we add PointFields (SOLR-8396), we'll need docValues for 
sorting/faceting on them. A basic/unoptimized schema would allow you to sort or 
facet on any of those fields as you can do now with TrieFields and the 
FieldCache/FieldValueCache. Later you can optimize the schema by removing DV 
from the fields where you don't need them. I believe that would be a better 
starting experience than not adding DV and giving errors. 

{quote}
I was thinking of things like changing the *_i dynamic field in our schema 
templates to explicitly include docValues="true" by default (or setting it on 
the fieldType if that's a flag that carries over to all of its fields by 
default).

We could consider changing the default such that even if it doesn't appear in 
the schema, docValues would be true (and bump the schema version of course). I 
had assumed that might be more controversial though, so was just looking to 
effectively change the de-facto defaults.
{quote}

I'm OK either way, I do think in any case we should leave the 
{{docValues="true"}} in the schema to make it more obvious that you want to 
look at that attribute for optimizing your schema. 




[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-08 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185492#comment-15185492
 ] 

Erick Erickson commented on SOLR-8740:
--

Tell you what, I'll either write a test or make sure one exists that 
illustrates my concern, and in any case put a big fat warning in the code to 
document the change if it suddenly starts failing.





[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-08 Thread Joel Bernstein (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185350#comment-15185350
 ] 

Joel Bernstein commented on SOLR-8740:
--

+1 

I think this ticket is a pretty important improvement because it means that the 
/export handler (requires docValues) will work out of the box.  The /export 
handler is needed for many features in Streaming Expressions and Parallel SQL. 
So having docValues on by default makes these features easier to use.




[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-08 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185336#comment-15185336
 ] 

Jack Krupansky commented on SOLR-8740:
--

My apologies for any unnecessary noise I may have caused here. I just think 
that every single docValues issue raised for Solr should endeavor to make the 
lives of Solr users a lot easier, not more complicated and even more confusing. 
As things stand, docValues is more of an expert-only feature. The mere fact 
that we can't make docValues uniformly the default illustrates that in spades.




[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-08 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185031#comment-15185031
 ] 

Yonik Seeley commented on SOLR-8740:


bq. The issue here, which Yonik created, is about making docValues the default. 
So even if you don't mark a field explicitly with docValues="true" in your 
schema, Solr will enable it. 

Oh, hmmm, that's a good point.  I hadn't actually meant that.  But I now 
realize that the issue title/description is somewhat ambiguous.
As an example, I was thinking of things like changing the *_i dynamic field in 
our schema templates to explicitly include docValues="true" by default (or 
setting it on the fieldType if that's a flag that carries over to all of its 
fields by default).

We could consider changing the default such that even if it doesn't appear in 
the schema, docValues would be true (and bump the schema version of course).  I 
had assumed that might be more controversial though, so was just looking to 
effectively change the de-facto defaults.

If we can set docValues=true on a fieldType and have it carry over to all 
fields by default, then that's even friendly for fields added via API or 
guessed fields.
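
As a rough illustration of the carry-over idea, the following Python sketch models how a per-field property could be resolved: an explicit attribute on the field wins, otherwise the fieldType's value applies, otherwise a built-in default. The names `resolve_property` and `BUILTIN_DEFAULTS` and the dictionaries are hypothetical and only model the behavior being discussed; this is not Solr's schema code. The `docValues` default of false is an assumption that reflects the situation this issue wants to change.

```python
# Hypothetical model of schema property resolution; not Solr's implementation.
# Assumption: indexed/stored default to true, docValues defaults to false.
BUILTIN_DEFAULTS = {"indexed": True, "stored": True, "docValues": False}


def resolve_property(name, field_attrs, type_attrs, defaults=BUILTIN_DEFAULTS):
    """Return the effective value: field attribute, then fieldType, then default."""
    if name in field_attrs:
        return field_attrs[name]
    if name in type_attrs:
        return type_attrs[name]
    return defaults[name]


# A fieldType that turns docValues on for every field that uses it.
int_type = {"docValues": True}

# A dynamic field such as *_i that says nothing about docValues inherits it.
inherits = resolve_property("docValues", {"indexed": True, "stored": True}, int_type)

# A field that opts out explicitly keeps its own setting.
opted_out = resolve_property("docValues", {"docValues": False}, int_type)

print(inherits, opted_out)
```

In this model a field added later (via an API call or field guessing) that only names its type would pick up `docValues` from the type, which is the "friendly" behavior described above.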






[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-08 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185002#comment-15185002
 ] 

Varun Thacker commented on SOLR-8740:
-

Can we please not mix up the issues?

The issue here, which Yonik created, is about making docValues the default. So 
even if you don't mark a field explicitly with docValues="true" in your schema, 
Solr will enable it. You have to explicitly turn it off instead. This does 
not change any behaviour on how fields are returned etc.

Discussions which started with {{I still find the whole docValues vs. Stored 
fields narrative extremely confusing.}} and any followups on this are 
tangential to this issue. Solr now supports two ways to return the actual 
field values. SOLR-8220 has all the details and there is a very detailed entry 
about it in the CHANGES file. Can we please have any followup discussions on 
that in a separate mailing list thread or on the SOLR-8220 Jira?

{code}
* The Solr schema version has been increased to 1.6. Since schema version 1.6, all non-stored docValues fields
  will be returned along with other stored fields when all fields (or pattern matching globs) are specified
  to be returned (e.g. fl=*) for search queries. This behavior can be turned on and off by setting
  'useDocValuesAsStored' parameter for a field or a field type to true (default since schema version 1.6)
  or false (default till schema version 1.5).
  Note that enabling this property has performance implications because DocValues are column-oriented and may
  therefore incur additional cost to retrieve for each returned document. All example schema are upgraded to
  version 1.6 but any older schemas will default to useDocValuesAsStored=false and continue to work as in
  older versions of Solr. If this new behavior is desirable, then you should set version attribute in your
  schema file to '1.6'. Re-indexing is not necessary to upgrade the schema version.
  Also note that while returning non-stored fields from docValues (default in schema versions 1.6+, unless
  useDocValuesAsStored is false), the values of a multi-valued field are returned in sorted order.
  If you require the multi-valued fields to be returned in the original insertion order, then make your
  multi-valued field as stored. This requires re-indexing.
  See SOLR-8220 for more details.
{code}
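
To make the CHANGES entry above concrete, here is a small Python sketch of which fields a `fl=*` request would return under the described rule. The function names and the field dictionaries are hypothetical; the only facts taken from the entry are that `useDocValuesAsStored` defaults to true from schema version 1.6 and to false up to 1.5. Every field in this toy model spells out `stored` and `docValues` explicitly, so no other defaults are assumed.

```python
def default_use_dv_as_stored(schema_version):
    # Per the CHANGES entry: true since schema version 1.6, false until 1.5.
    return schema_version >= 1.6


def returned_for_star(fields, schema_version):
    """Names of the fields returned for fl=* in this simplified model."""
    result = []
    for name, props in fields.items():
        use_dv = props.get("useDocValuesAsStored",
                           default_use_dv_as_stored(schema_version))
        # A field is returned if it is stored, or if it has docValues
        # and those docValues are allowed to stand in for stored values.
        if props["stored"] or (props["docValues"] and use_dv):
            result.append(name)
    return result


fields = {
    "id": {"stored": True, "docValues": False},
    "price": {"stored": False, "docValues": True},
    "hidden": {"stored": False, "docValues": True, "useDocValuesAsStored": False},
}

print(returned_for_star(fields, 1.6))
print(returned_for_star(fields, 1.5))
```

With the same field definitions, only the schema version changes what `fl=*` returns, which is why older schemas keep their previous behavior without re-indexing.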




[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-08 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184997#comment-15184997
 ] 

Yonik Seeley commented on SOLR-8740:


This issue is really only about changing what our starting schemas look like I 
think, and won't change any behavior.  That's already been done in other issues.





[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-08 Thread Simon Rosenthal (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184988#comment-15184988
 ] 

Simon Rosenthal commented on SOLR-8740:
---

If this is adopted, it needs to be clearly documented that DocValues do not 
retain ordering in multivalued fields whereas stored fields do. Our use case: 
picking the first and last authors from a multivalued 'authors' String field.
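
The ordering concern can be sketched in Python. This assumes, as the CHANGES entry for SOLR-8220 quoted elsewhere in this thread states, that multi-valued fields read from docValues come back in sorted order. The deduplication shown is an additional assumption about sorted-set storage, and `from_stored` and `from_doc_values` are hypothetical helpers, not Solr APIs.

```python
authors = ["Zhang", "Miller", "Adams", "Miller"]


def from_stored(values):
    # Stored fields hand back values in their original insertion order.
    return list(values)


def from_doc_values(values):
    # Assumption: values are kept as a sorted set, so the original order
    # (and any duplicate) is lost.
    return sorted(set(values))


stored = from_stored(authors)
dv = from_doc_values(authors)

# "First author" and "last author" are only meaningful for the stored copy.
print(stored[0], stored[-1])
print(dv[0], dv[-1])
```

In this sketch the stored copy still identifies the first and last authors as entered, while the docValues copy reports the alphabetically first and last names instead.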





[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-08 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15184957#comment-15184957
 ] 

Yonik Seeley commented on SOLR-8740:


bq. And default to docValuesFormat="Memory" as well, or is that already the 
default when docValues="true" is set?

The ability to keep everything off-heap and accessed via memory map seems to be 
the default we want (i.e. the docValues default of "disk").
Sorting and faceting on docValues fields will most likely be a little slower 
(ignoring NRT) but the real benefit is having things "just work" and 
avoiding the dreaded "I got an OOM exception when I tried to sort on this 
field" issues.  So we're going to want them by default on non-text fields that 
people will use for sorting, faceting, stats, etc.

I'll try and make some time to tackle this issue soon (sometime after the 
Lucene/Solr NYC meetup on Wednesday)




[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-07 Thread Adrien Grand (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183357#comment-15183357
 ] 

Adrien Grand commented on SOLR-8740:


bq. And default to docValuesFormat="Memory" as well, or is that already the 
default when docValues="true" is set?

Having the default setup not using the default codec looks dangerous to me as 
it means that users won't be able to upgrade clusters without switching back to 
the default codec first (which is the only supported one for backwards 
compatibility).

bq.  I've never been able to figure out why Lucene still needs Stored fields 
(other than for tokenized text fields) if docValues is so much better.

Doc values are not better, they just have different trade-offs: stored fields 
are optimized for randomly getting several values from a couple dozen documents 
while doc values are optimized for sequentially reading a couple values from 
many documents. If you were to replace stored fields with doc values, 
performance would become horrible if your index is significantly larger than 
your filesystem cache, especially if you have spinning disks. I suspect it 
could be fine if it was only done for the version field as suggested above but 
doing it for all fields sounds dangerous to me.
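
The trade-off described above can be pictured with a toy Python model: stored fields behave like a row-oriented layout (all fields of one document together) and doc values like a column-oriented layout (one field across all documents together). The data and helper names are made up for illustration and say nothing about Lucene's actual file formats.

```python
docs = [
    {"id": 1, "price": 10, "title": "a"},
    {"id": 2, "price": 30, "title": "b"},
    {"id": 3, "price": 20, "title": "c"},
]

# Row-oriented (stored-fields-like): one record per document.
rows = {d["id"]: d for d in docs}

# Column-oriented (doc-values-like): one array per field, indexed by position.
columns = {field: [d[field] for d in docs] for field in docs[0]}


def fetch_document(doc_id):
    # Cheap in the row layout: one lookup returns every field of the hit.
    return rows[doc_id]


def sort_positions_by(field):
    # Cheap in the column layout: scan one array, touch no other field.
    col = columns[field]
    return sorted(range(len(col)), key=col.__getitem__)


def fetch_document_from_columns(position):
    # Rebuilding a whole document from columns needs one access per field.
    return {field: col[position] for field, col in columns.items()}


print(fetch_document(2))
print(sort_positions_by("price"))
print(fetch_document_from_columns(1))
```

Fetching a handful of complete documents favors the row layout, while sorting or faceting over one field for many documents favors the column layout; reconstructing full documents from columns multiplies the number of scattered reads, which is the cost warned about above.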




Re: [jira] [Commented] (SOLR-8740) use docValues by default

2016-03-07 Thread Erick Erickson
Note that any multivalued fields that rely on the data being returned in
the same order it was added will break, since MV fields are returned in
sorted order when they're DV fields. Probably an edge case, but still.
On Mar 7, 2016 08:06, "Ishan Chattopadhyaya (JIRA)"  wrote:

>
> [
> https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183190#comment-15183190
> ]
>
> Ishan Chattopadhyaya commented on SOLR-8740:
> 
>
> In the same breath, now that non-stored docValues fields can be accessed
> through the docValues api, I think the version field should be non-stored
> dv going forward. SOLR-6337
>


[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-07 Thread Ishan Chattopadhyaya (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183190#comment-15183190
 ] 

Ishan Chattopadhyaya commented on SOLR-8740:


In the same breath, now that non-stored docValues fields can be accessed 
through the docValues api, I think the version field should be non-stored dv 
going forward. SOLR-6337




[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-07 Thread Varun Thacker (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183180#comment-15183180
 ] 

Varun Thacker commented on SOLR-8740:
-

+1. Too many new users still don't know about docValues, and when their index 
grows large and they start seeing memory pressure issues, re-indexing is a big 
pain.






[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-07 Thread Jack Krupansky (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183177#comment-15183177
 ] 

Jack Krupansky commented on SOLR-8740:
--

And default to docValuesFormat="Memory" as well, or is that already the default 
when docValues="true" is set?

Personally, I still find the whole docValues vs. Stored fields narrative 
extremely confusing. I've never been able to figure out why Lucene still needs 
Stored fields (other than for tokenized text fields) if docValues is so much 
better.

In any case, with this Jira in place, there should be clear documentation of 
the scenarios, if any, in which stored="true" has any utility for 
non-tokenized/text fields.




[jira] [Commented] (SOLR-8740) use docValues by default

2016-03-07 Thread JIRA

[ 
https://issues.apache.org/jira/browse/SOLR-8740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183152#comment-15183152
 ] 

Tomás Fernández Löbbe commented on SOLR-8740:
-

+1
