Re: index size, stored vs indexed
Can't really be answered. For instance, stored data is held in *.fdt files and is largely irrelevant to searching, since that data is only consulted when returning stored fields for the top N docs. So if your index consists of 90% stored data, it's one answer; if 10%, it's totally another. The stored data can be swapped in and out of the OS memory space (because it's MMapped) with vastly less impact on your system than other parts of the index.

Certainly if you could fit it all in memory it'd be as fast as possible; whether it would be enough faster to justify any extra cost is the question. Plus you'll want to understand how much data on your particular system is "too much" and take proactive action when you approach that limit.

So yeah, you'll have to test. Here's a long blog on the subject: https://lucidworks.com/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/. Skip to the section "Prototyping: how to get a handle on this problem".

Best,
Erick

On Wed, Nov 14, 2018 at 7:28 AM David Hastings wrote:
> Was wondering if anyone has an idea of the ratio size of indexed only vs
> stored and indexed in solr 7.x. I was going to run some testing myself
> later today but was curious what others have seen in this regard.
> Thanks,
> David
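One rough way to see which of those cases an existing index falls into is to compare the bytes held in *.fdt files against the total size of the index directory. The sketch below is only an illustration: the class and method names are made up for this example, it looks at a single directory without recursing, and it assumes the segment files are not packed into compound (.cfs) files, where the stored data would not be visible as separate *.fdt files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper, not part of Solr or Lucene.
final class StoredDataShare {

    /** Returns the fraction (0.0 to 1.0) of bytes in indexDir that sit in *.fdt files. */
    static double storedFraction(Path indexDir) throws IOException {
        long total = 0;
        long stored = 0;
        try (Stream<Path> files = Files.list(indexDir)) {
            for (Path p : (Iterable<Path>) files::iterator) {
                if (!Files.isRegularFile(p)) {
                    continue; // skip subdirectories and other non-file entries
                }
                long size = Files.size(p);
                total += size;
                if (p.getFileName().toString().endsWith(".fdt")) {
                    stored += size; // stored-field data
                }
            }
        }
        return total == 0 ? 0.0 : (double) stored / total;
    }

    public static void main(String[] args) throws IOException {
        // Pass the path of an index directory; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.printf("stored share: %.1f%%%n", 100 * storedFraction(dir));
    }
}
```

A number like this only says how much of the index is stored data; it does not replace the prototyping the blog post describes.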
index size, stored vs indexed
Was wondering if anyone has an idea of the size ratio of indexed-only vs. stored-and-indexed fields in Solr 7.x. I was going to run some tests myself later today but was curious what others have seen in this regard. Thanks, David
Re: Solr UninvertingReader getNumericDocValues doesn't seem to work for fields that are not stored or indexed
Joel, Thank you for the reply! This approach solved my problem. Now should I be concerned about the 32 bits that are lost in converting the long to an int? Also, is this the intended approach when using NumericDocValues? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-UninvertingReader-getNumericDocValues-doesn-t-seem-to-work-for-fields-that-are-not-stored-or-ind-tp4251881p4252035.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr UninvertingReader getNumericDocValues doesn't seem to work for fields that are not stored or indexed
On Wed, Jan 20, 2016 at 10:19 AM, plbarrios wrote:
> Joel,
>
> Thank you for the reply!
>
> This approach solved my problem.
>
> Now should I be concerned about the 32 bits that are lost in converting the
> long to an int? Also, is this the intended approach when using
> NumericDocValues?

If the value was originally a float, then no bits are lost; there were only 32 to begin with.

-Yonik
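The point about no bits being lost can be checked in isolation. The sketch below assumes the doc value is simply the float's 32-bit pattern widened to a long, which is consistent with the numbers reported in this thread but is an assumption about the encoding, not something taken from the Lucene source. The class and method names are hypothetical.

```java
// Hypothetical illustration; not Lucene or Solr code.
final class FloatBitsRoundTrip {

    /** Widens the float's 32-bit pattern to a long (sign-extended for negative floats). */
    static long encode(float f) {
        return Float.floatToIntBits(f);
    }

    /** Narrows back to the low 32 bits and reinterprets them as a float. */
    static float decode(long raw) {
        return Float.intBitsToFloat((int) raw);
    }

    public static void main(String[] args) {
        float[] samples = {0.95f, -0.95f, 0.0f, Float.MAX_VALUE, Float.MIN_VALUE};
        for (float f : samples) {
            long raw = encode(f);
            System.out.println(f + " -> " + raw + " -> " + decode(raw));
        }
    }
}
```

The cast to int discards only the upper 32 bits, which carry no information when the long was produced from an int in the first place. A field that held a genuine long would not survive this cast.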
Solr UninvertingReader getNumericDocValues doesn't seem to work for fields that are not stored or indexed
I have a field defined in my schema.xml that is a simple float that is neither stored nor indexed, as follows:

* stored="false" required="false" multiValued="false"/>*

In Solr 4 I was able to get the float value the following way:

    FieldCache.Floats docBoosts = FieldCache.DEFAULT.getFloats(context.reader(), FieldName.Boost, false);
    float boost = docBoosts == null ? 1.0f : docBoosts.get(doc);
    return boost == 0.0f ? 1.0f : boost;

In Solr 5 I tried to do this instead of FieldCache:

    NumericDocValues docBoosts = DocValues.getNumeric(context.reader(), FieldName.Boost);

However, this approach didn't work, as UninvertingReader.getType fails to detect that this field is a float and instead provides me with a NumericDocValues from Lucene54DocValuesProducer that changes the value of 0.95f to 1.06451437E9.

Has anyone found a similar issue, and if so, could you tell me what I am missing?

I have posted this to Stack Overflow as well, at the following link:
http://stackoverflow.com/questions/34888150/solr-uninvertingreader-getnumericdocvalues-doesnt-seem-to-work-for-fields-that

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-UninvertingReader-getNumericDocValues-doesn-t-seem-to-work-for-fields-that-are-not-stored-or-ind-tp4251881.html
Re: Solr UninvertingReader getNumericDocValues doesn't seem to work for fields that are not stored or indexed
Try converting the long to a float using: Float.intBitsToFloat((int)val). It should come back with the correct float.

Joel Bernstein
http://joelsolr.blogspot.com/

On Tue, Jan 19, 2016 at 5:35 PM, plbarrios <plbarr...@msn.com> wrote:
> I have a defined field in my schema.xml that is a simple float that is not
> stored nor indexed as follows:
>
> * stored="false" required="false" multiValued="false"/>*
>
> In Solr 4 I was able to get the float value the following way:
>
> *FieldCache.Floats docBoosts =
> FieldCache.DEFAULT.getFloats(context.reader(), FieldName.Boost, false);
> float boost = docBoosts == null ? 1.0f : docBoosts.get(doc)
> return boost == 0.0f ? 1.0f : boost*
>
> In Solr 5 I tried to do this instead of FieldCache:
>
> *NumericDocValues docBoosts = DocValues.getNumeric(context.reader(),
> FieldName.Boost)*
>
> However this approach didn't work, as the UninvertingReader.getType fails
> to detect that this field is a float an instead provides me with a
> NumericDocValue from Lucene54DocValuesProducer that changes the value
> of *0.95f* to *1.06451437E9*.
>
> Has anyone found a similar issue, and if so could you tell me what am I
> missing?
>
> I have posted this to stackoverflow as well to the following link.
>
> http://stackoverflow.com/questions/34888150/solr-uninvertingreader-getnumericdocvalues-doesnt-seem-to-work-for-fields-that
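To make the suggestion concrete, here is a minimal standalone sketch of the decoding step using the value from this thread. It deliberately leaves out the Lucene reader calls; the class and method names are hypothetical, and the 1.0f fallback for a zero boost is carried over from the Solr 4 snippet in the original question. The raw value 1064514355 is the IEEE 754 bit pattern of 0.95f, which is why reading it as a plain number gives roughly 1.06451437E9.

```java
// Hypothetical illustration; the raw long stands in for what NumericDocValues returned.
final class BoostDecoder {

    /** Reinterprets the low 32 bits of the raw doc value as a float, defaulting 0 to 1.0f. */
    static float boostFrom(long raw) {
        float boost = Float.intBitsToFloat((int) raw);
        return boost == 0.0f ? 1.0f : boost;
    }

    public static void main(String[] args) {
        long raw = 1064514355L;          // bit pattern of 0.95f, widened to a long
        float misread = (float) raw;     // numeric conversion: the wrong value from the report
        float decoded = boostFrom(raw);  // bit reinterpretation: the intended boost
        System.out.println("misread = " + misread);
        System.out.println("decoded = " + decoded);
    }
}
```

The difference is between converting the number (a cast to float) and reinterpreting its bits (Float.intBitsToFloat); only the second recovers the original value.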
Re: query time join (stored or indexed value field?)
Indexed for sure, and/or docValues. Not stored, for sure.

On Mon, Jan 26, 2015 at 3:44 PM, Alvaro Cabrerizo topor...@gmail.com wrote:
> Hi,
>
> Is the query time join (http://wiki.apache.org/solr/Join) using stored data
> or indexed data from the fields set in from and to? (For example, the facet
> feature makes the count based on the indexed data.) I've made a small
> example (using tokenizers, stopwords...) and it seems that the join uses
> the stored one, but it would be nice to confirm it.
>
> Regards.

--
Sincerely yours
Mikhail Khludnev
Principal Engineer, Grid Dynamics
http://www.griddynamics.com
mkhlud...@griddynamics.com
query time join (stored or indexed value field?)
Hi, Is the query time join (http://wiki.apache.org/solr/Join) using stored data or indexed data from the fields set in from and to? (For example, the facet feature computes its counts from the indexed data.) I've made a small example (using tokenizers, stopwords...) and it seems that the join uses the stored data, but it would be nice to confirm it. Regards.
RE: Stored or indexed?
Thanks for the great info! I appreciate everybody's help in getting started with Solr, hopefully I'll be able to get my stuff working and move on to more difficult questions. :)

DISCLAIMER: This electronic message, including any attachments, files or documents, is intended only for the addressee and may contain CONFIDENTIAL, PROPRIETARY or LEGALLY PRIVILEGED information. If you are not the intended recipient, you are hereby notified that any use, disclosure, copying or distribution of this message or any of the information included in or with it is unauthorized and strictly prohibited. If you have received this message in error, please notify the sender immediately by reply e-mail and permanently delete and destroy this message and its attachments, along with any copies thereof. This message does not create any contractual obligation on behalf of the sender or Law Bulletin Publishing Company. Thank you.
Re: Stored or indexed?
IMO, the very, very best way to increase your grasp of all things Solr is to try to answer questions on this list. Folks are pretty gentle about correcting mistaken posts. And I certainly remember any advice I've given that's been corrected <G>. Besides, if you try to answer the things you *do* understand, it leaves more time for the committers to answer *your* questions <G>...

Best
Erick

On Tue, Nov 2, 2010 at 4:39 PM, Olson, Ron rol...@lbpc.com wrote:
> Thanks for the great info! I appreciate everybody's help in getting started
> with Solr, hopefully I'll be able to get my stuff working and move on to
> more difficult questions. :)
Re: Stored or indexed?
Hi Ron,

In a nutshell - an indexed field is searchable, and a stored field has its content stored in the index so it is retrievable. Here are some examples that will hopefully give you a feel for how to set the indexed and stored options:

indexed=true stored=true
Use this for information you want to search on and also display in search results - for example, book title or author.

indexed=false stored=true
Use this for fields that you want displayed with search results but that don't need to be searchable - for example, destination URL, file system path, time stamp, or icon image.

indexed=true stored=false
Use this for fields you want to search on but whose values you don't need in search results. Here are some of the common reasons you would want this:

Large fields and a database: Storing a field makes your index larger, so set stored to false when possible, especially for big fields. For this case a database is often used, as the previous responder said. Use a separate identifier field to get the field's content from the database.

Ordering results: Say you define
    <field name="bookName" type="text" indexed="true" stored="true"/>
that is tokenized and used for searching. If you want to sort results based on book name, you could copy the field into a separate nonretrievable, nontokenized field that can be used just for sorting -
    <field name="bookSort" type="string" indexed="true" stored="false"/>
    <copyField source="bookName" dest="bookSort"/>

Easier searching: If you define the field
    <field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
you can use it as a catch-all field that contains all of the other text fields. Since Solr looks in a default field when given a text query without field names, you can support this type of general phrase query by making the catch-all the default field.

indexed=false stored=false
Use this when you want to ignore fields. For example, the following will ignore unknown fields that don't match a defined field rather than throwing an error by default.
    <fieldtype name="ignored" stored="false" indexed="false"/>
    <dynamicField name="*" type="ignored"/>

Elizabeth Murnane
emurn...@architexa.com
Architexa Lead Developer - www.architexa.com
Understand Document Code In Seconds
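The difference between the two flags can also be shown with a toy model. The sketch below is not how Lucene or Solr is implemented; it is a deliberately simplified in-memory stand-in, with hypothetical names, where "indexed" means the field's lowercased tokens go into a lookup table and "stored" means the original value is kept for retrieval.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only; real Solr/Lucene indexing is far more involved.
final class ToyIndex {
    // "field:token" -> ids of documents containing it (the indexed side)
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // doc id -> field name -> original value (the stored side)
    private final Map<Integer, Map<String, String>> storedFields = new HashMap<>();

    void add(int docId, String field, String value, boolean indexed, boolean stored) {
        if (indexed) {
            // Crude analysis: lowercase and split on whitespace.
            for (String token : value.toLowerCase(Locale.ROOT).split("\\s+")) {
                postings.computeIfAbsent(field + ":" + token, k -> new TreeSet<>()).add(docId);
            }
        }
        if (stored) {
            storedFields.computeIfAbsent(docId, k -> new HashMap<>()).put(field, value);
        }
    }

    /** Finds documents by term; only indexed fields can match. */
    Set<Integer> search(String field, String term) {
        String key = field + ":" + term.toLowerCase(Locale.ROOT);
        return postings.getOrDefault(key, Collections.emptySet());
    }

    /** Returns the original value, or null if the field was not stored. */
    String fetch(int docId, String field) {
        return storedFields.getOrDefault(docId, Collections.emptyMap()).get(field);
    }

    public static void main(String[] args) {
        ToyIndex idx = new ToyIndex();
        idx.add(1, "bookName", "Solr in Action", true, true);       // search and display
        idx.add(1, "text", "Solr in Action search server", true, false); // search only
        idx.add(1, "url", "http://example.com/1", false, true);     // display only
        System.out.println(idx.search("text", "server")); // matches doc 1
        System.out.println(idx.fetch(1, "text"));          // nothing to return
        System.out.println(idx.fetch(1, "url"));           // original value
    }
}
```

In this model a field with both flags false leaves no trace at all, which is the effect the "ignored" field type is used for.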
Re: Stored or indexed?
In our case, we just store a database id and do a secondary db query when displaying the results. This is handy and leads to a more centralised architecture when you need to display properties of a domain object which you don't index/search.
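As a rough sketch of that pattern: the search step returns only ids, and the display step looks up everything else in the database. Here a plain Map stands in for the database and a List of ids stands in for the search results; both, along with the class and method names, are placeholders invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; the Map stands in for a real database query.
final class SecondaryLookup {

    /** Builds display rows for the hits, in hit order, skipping ids missing from the database. */
    static List<String> display(List<String> hitIds, Map<String, String> database) {
        List<String> rows = new ArrayList<>();
        for (String id : hitIds) {
            String title = database.get(id); // the secondary lookup by stored id
            if (title != null) {
                rows.add(id + ": " + title);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, String> database = new HashMap<>();
        database.put("7", "Lucene in Action");
        database.put("42", "Solr in Action");
        // Ids as they might come back from a search, best match first.
        List<String> hits = Arrays.asList("42", "7");
        display(hits, database).forEach(System.out::println);
    }
}
```

The ranking comes from the search side, so the rows keep the order of the hits rather than the order of the database.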
Stored or indexed?
Hi all- I've read through the documentation, but I'm still a little confused about the <field/> tag, in terms of the indexed and stored attributes. If I have something marked as indexed=true, why would I ever want stored=false? Are there any good tips-n-tricks anywhere about how to properly set the field tag? I've been finding bits and pieces both on the wiki and a couple of other websites, but there doesn't seem to be a good definitive how-to on this. Thanks for any info, Ron
Re: Stored or indexed?
http://wiki.apache.org/solr/FieldOptionsByUseCase
Re: Stored or indexed?
Interesting wiki link, I hadn't seen that table before. And to answer your specific question about indexed=true, stored=false: this is most often done when you are using analyzers/tokenizers on your field. This field is for search only; you would never retrieve its contents for display. It may in fact be an amalgam of several fields into one 'content' field. You have your display copy stored in another field marked indexed=false, stored=true and optionally compressed. I also have simple string fields set to lowercase so searching is case-insensitive, and have a duplicate field where the string is in normal case. The first one is indexed/not stored, the second is stored/not indexed. -- View this message in context: http://lucene.472066.n3.nabble.com/Stored-or-indexed-tp1782805p1784315.html
Re: stored and indexed in schema
Hi,

Thank you for your reply. It looks like I will get some benefit, but I will also lose the highlighter/summary functionality, is that right?

Thank you,
Vinci

Erik Hatcher wrote:
> On Mar 31, 2008, at 11:56 PM, Vinci wrote:
>> I would like to ask, if I set a field to be indexed but not stored, I can
>> retrieved the document but cannot retrieve this field?
>
> That's correct. By definition :)
>
>> If I have large field that I want to index but I am not suppose to show
>> them to user (The origin content stored in another processed document
>> where I am using another field in Solr to point to their location...I
>> throw the retrieval job to the server :P), will I get faster respond even
>> the query doesn't ask solr to return this large field?
>
> You'll get better response in that Solr won't be taking the time to
> retrieve the large stored field, writing it to the response, and the
> client-side parsing that data, sure.
>
> Erik

-- View this message in context: http://www.nabble.com/stored-and-indexed-in-schema-tp16411090p16419438.html
Re: stored and indexed in schema
On Apr 1, 2008, at 1:35 PM, Vinci wrote:
> Thank you for your reply, It look like I will have some benefit but I will
> also lose the highlighter/summary functionary, it that right?

Ya can't highlight what you don't have. So that's true. I think eventually it would be handy for Solr to allow the client to post in some text to be highlighted, but it does not currently support that.

Erik

> Thank you,
> Vinci
>
> Erik Hatcher wrote:
>> On Mar 31, 2008, at 11:56 PM, Vinci wrote:
>>> I would like to ask, if I set a field to be indexed but not stored, I can
>>> retrieved the document but cannot retrieve this field?
>>
>> That's correct. By definition :)
>>
>>> If I have large field that I want to index but I am not suppose to show
>>> them to user (The origin content stored in another processed document
>>> where I am using another field in Solr to point to their location...I
>>> throw the retrieval job to the server :P), will I get faster respond even
>>> the query doesn't ask solr to return this large field?
>>
>> You'll get better response in that Solr won't be taking the time to
>> retrieve the large stored field, writing it to the response, and the
>> client-side parsing that data, sure.
>>
>> Erik
stored and indexed in schema
Hi, I would like to ask: if I set a field to be indexed but not stored, can I still retrieve the document but not this field? If I have a large field that I want to index but am not supposed to show to the user (the original content is stored in another processed document, and I am using another field in Solr to point to its location... I leave the retrieval job to the server :P), will I get a faster response even when the query doesn't ask Solr to return this large field? Thank you, Vinci -- View this message in context: http://www.nabble.com/stored-and-indexed-in-schema-tp16411090p16411090.html