Warning: This is from a Lucene perspective... I don't think it matters. I'm pretty sure that COMPRESS only applies to *storing* the data, not to putting the tokens in the index (the latter is what's searched)...
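The stored-versus-indexed split can be illustrated with a toy model. This is a minimal Python sketch, not Lucene or Solr code: `ToyIndex`, `add`, `search` and `load` are hypothetical names, and zlib is only a stand-in for whatever the real stored-field writer uses. The point is that `search` reads the token postings and never touches the compressed bytes, while `load` has to decompress.

```python
import zlib
from collections import defaultdict


class ToyIndex:
    """Toy model: postings are built from tokens, stored values are kept separately."""

    def __init__(self, compress_stored=False):
        self.compress_stored = compress_stored
        self.postings = defaultdict(set)   # token -> set of doc ids (what search reads)
        self.stored = {}                   # doc id -> raw or compressed bytes
        self.decompress_calls = 0          # counts how often stored data is unpacked

    def add(self, doc_id, text):
        # Indexing: tokens go into the postings, uncompressed either way.
        for token in text.lower().split():
            self.postings[token].add(doc_id)
        # Storing: only this copy is affected by the compression flag.
        raw = text.encode("utf-8")
        self.stored[doc_id] = zlib.compress(raw) if self.compress_stored else raw

    def search(self, token):
        # Searching consults the postings only; stored bytes are never read.
        return sorted(self.postings.get(token.lower(), set()))

    def load(self, doc_id):
        # Loading the stored value is where the decompression cost is paid.
        data = self.stored[doc_id]
        if self.compress_stored:
            self.decompress_calls += 1
            data = zlib.decompress(data)
        return data.decode("utf-8")


index = ToyIndex(compress_stored=True)
index.add(1, "field compression in Lucene")
index.add(2, "lazy field loading")

hits = index.search("field")            # no decompression happens here
bodies = [index.load(d) for d in hits]  # one decompression per loaded document
```

In this model the search cost is the same with or without `compress_stored`; only the number of documents whose stored value is loaded changes the decompression work.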
Compression *will* cause performance issues if you load that field for a large number of documents on a particular search. I know Lucene itself has lazy field loading that helps in this case, but I don't know how to persuade SOLR to use it (it may even lazy-load automatically). But this is separate from searching...

Best
er...@nottoomuchhelpbutimtrying

On Thu, Jun 4, 2009 at 4:07 AM, Fer-Bj <fernando.b...@gmail.com> wrote:

>
> Is it correct to assume that using field compression will cause performance
> issues if we decide to allow search over this field?
>
> ie:
>
> <field name="id" type="sint" indexed="true" stored="true" required="true" />
> <field name="title" type="text" indexed="true" stored="true" omitNorms="true"/>
> <field name="file_location" type="string" indexed="false" stored="true"/>
> <field name="body" type="text" indexed="true" stored="false" omitNorms="true"/>
>
> if I decide to add "compressed=true" to the BODY field... and I allow
> search on body... would that be a problem?
> At the same time: if I add compressed=true, but I never do search on this
> field.... ?
>
>
> Stu Hood-3 wrote:
> >
> > I just finished watching this talk about a column-store RDBMS, which has a
> > long section on column compression. Specifically, it talks about the gains
> > from compressing similar data together, and how lazily decompressing data
> > only when it must be processed is great for memory/CPU cache usage.
> >
> > http://youtube.com/watch?v=yrLd-3lnZ58
> >
> > While interesting, it's not relevant to Lucene's stored field storage. On
> > the other hand, it did get me thinking about stored field compression and
> > lazy field loading.
> >
> > Can anyone give me some pointers about compressThreshold values that would
> > be worth experimenting with? Our stored fields are often between 20 and
> > 300 characters, and we're willing to spend more time indexing if it will
> > make searching less IO bound.
> >
> > Thanks,
> >
> > Stu Hood
> > Architecture Software Developer
> > Mailtrust, a Rackspace Company
> >
>
> --
> View this message in context:
> http://www.nabble.com/Field-Compression-tp15258669p23865558.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
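On the compressThreshold question quoted above, one way to pick candidate values is to measure how much short strings actually shrink. The following is a rough Python sketch under stated assumptions: zlib (DEFLATE) stands in for the compressor, `store_value` and its `threshold` parameter are hypothetical and only mimic the idea of "compress values at least this long", and the sample strings are made up. It is not how Solr implements the setting.

```python
import zlib


def store_value(text, threshold):
    """Return (bytes_to_store, was_compressed); compress only if len(text) >= threshold."""
    raw = text.encode("utf-8")
    if len(text) < threshold:
        return raw, False
    packed = zlib.compress(raw, 9)
    # Assumption for this sketch: keep the raw bytes when compression saves nothing.
    if len(packed) >= len(raw):
        return raw, False
    return packed, True


def savings(samples, threshold):
    """Total bytes saved across all samples for a given threshold."""
    raw_total = sum(len(s.encode("utf-8")) for s in samples)
    stored_total = sum(len(store_value(s, threshold)[0]) for s in samples)
    return raw_total - stored_total


# Made-up sample values in the 20-300 character range mentioned in the thread.
samples = [
    "short title, 24 chars ok",
    "a medium sized description of a mail message " * 3,
    "lorem ipsum dolor sit amet consectetur " * 7,
]

# Print the bytes saved for a few candidate thresholds.
for threshold in (0, 50, 100, 200):
    print(threshold, savings(samples, threshold))
```

Each compressed value carries a few bytes of fixed overhead, so very short values may not shrink at all. Running something like this over a sample of real field values, rather than the invented ones here, would give a better idea of where a threshold starts to pay off.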