That's a cool patch. Thanks
On Thursday, July 3, 2014, Gopal Patwa <gopalpa...@gmail.com> wrote:

> Thanks Ravi, it is good to know the general problems with updatable
> fields. In our use case we have a few fields which update more
> frequently than the main index. We are using this SOLR join contrib
> patch with DocTransformer for returning data from the join core. This
> approach has some performance impact, but if that hit is acceptable for
> your use case, you can give it a try if you are using SOLR.
>
> https://issues.apache.org/jira/browse/SOLR-4787
>
>
> On Thu, Jul 3, 2014 at 3:22 AM, Ravikumar Govindarajan
> <ravikumar.govindara...@gmail.com> wrote:
>
> > In case of sorting, updatable DocValues may be what you are looking for.
> >
> > But updatable fields for searching is a different beast.
> >
> > A sample approach is documented at
> >
> > http://www.flax.co.uk/blog/2012/06/22/updating-individual-fields-in-lucene-with-a-redis-backed-codec/
> >
> > The general problems with an updatable postings list, AFAIK, are:
> >
> > 1. Impossible to correctly score updated documents
> > 2. Segment merges could miss out updates
> > 3. Might behave incorrectly with NRT
> > 4. Frequent updates could end up creating lots of files because of the
> >    append-only nature of Lucene...
> >
> > Maybe if you are not too worried about scoring, correct NRT behavior
> > etc., you can attempt a solution like the RedisCodec stuff...
> >
> > Segregating static & dynamic fields into 2 separate indexes as
> > described here
> >
> > http://www.lucenerevolution.org/2013/Sidecar-Index-Solr-Components-for-Parallel-Index-Management
> >
> > may be of some use to you.
> >
> > --
> > Ravi
> >
> >
> > On Wed, Jul 2, 2014 at 7:29 PM, Shai Erera <ser...@gmail.com> wrote:
> >
> > > Using BinaryDocValues is not recommended for all scenarios. It is a
> > > "catchall" alternative to the other DocValues types. I would not use
> > > it unless it makes sense for your application, even if it means that
> > > you need to re-index a document in order to update a single field.
> > >
> > > DocValues are not good for "search" - by search I assume you mean
> > > take a query such as "apache AND lucene" and find all documents which
> > > contain both terms under the same field. They are good for sorting
> > > and faceting though.
> > >
> > > So I guess the answer to your question is "it depends" (it always
> > > is!) - I would use DocValues for sorting and faceting, but not for
> > > regular search queries. And I would use BinaryDocValues only when the
> > > other DocValues types don't match.
> > >
> > > Also, note that the current field-level update of DocValues is not
> > > always better than re-indexing the document; you can read here for
> > > more details:
> > >
> > > http://shaierera.blogspot.com/2014/04/benchmarking-updatable-docvalues.html
> > >
> > > Shai
> > >
> > >
> > > On Tue, Jul 1, 2014 at 9:17 PM, Sandeep Khanzode
> > > <sandeep_khanz...@yahoo.com.invalid> wrote:
> > >
> > > > Hi Shai,
> > > >
> > > > So one follow-up question.
> > > >
> > > > Assume that my use case is to have approx. ~50M documents indexed,
> > > > with each document having about ~10-15 indexed but not stored
> > > > fields. These fields will never change, but there are another ~5-6
> > > > fields that will change and will continue to change after the index
> > > > is written. These ~5-6 fields may also be multivalued. The size of
> > > > this index turns out to be ~120GB.
> > > >
> > > > In this case, I would like to sort or facet or search on these ~5-6
> > > > fields. Which approach do you suggest? Should I use BinaryDocValues
> > > > and update using IW, or use either a ParallelReader or a Join query?
> > > >
> > > > -----------------------
> > > > Thanks n Regards,
> > > > Sandeep Ramesh Khanzode
> > > >
> > > >
> > > > On Tuesday, July 1, 2014 9:53 PM, Shai Erera <ser...@gmail.com> wrote:
> > > >
> > > > Except that Lucene now offers efficient numeric and binary DocValues
> > > > updates. See IndexWriter.updateNumeric/Binary...
> > > >
> > > > On Jul 1, 2014 5:51 PM, "Erick Erickson" <erickerick...@gmail.com>
> > > > wrote:
> > > >
> > > > > This JIRA is "complicated", don't really expect it in 4.9 as it's
> > > > > been hanging around for quite a while. Everyone would like this,
> > > > > but it's not easy.
> > > > >
> > > > > Atomic updates will work, but you have to set stored="true" for
> > > > > all source fields. Under the covers this actually reads the
> > > > > document out of the stored fields, deletes the old one and adds
> > > > > it over again.
> > > > >
> > > > > FWIW,
> > > > > Erick
> > > > >
> > > > > On Tue, Jul 1, 2014 at 5:32 AM, Sandeep Khanzode
> > > > > <sandeep_khanz...@yahoo.com.invalid> wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I wanted to know the best approach to follow if a few fields in
> > > > > > my indexed documents are changing at run time (after index and
> > > > > > before or during search), but a majority of them are created at
> > > > > > index time.
> > > > > >
> > > > > > I could see the JIRA given below, but it is scheduled for Lucene
> > > > > > 4.9, I believe.
> > > > > >
> > > > > > There are a few other approaches, like maintaining a separate
> > > > > > index for changing fields and using either a ParallelReader or a
> > > > > > Join.
> > > > > >
> > > > > > Can everyone share their experience for this scenario on how it
> > > > > > is handled in your systems? Thanks,
> > > > > >
> > > > > > [LUCENE-4258] Incremental Field Updates through Stacked Segments
> > > > > > - ASF JIRA
> > > > > > Shai and I would like to start working on the proposal to
> > > > > > Incremental Field Updates outlined here
> > > > > > (http://markmail.org/message/zhrdxxpfk6qvdaex).
> > > > > >
> > > > > > -----------------------
> > > > > > Thanks n Regards,
> > > > > > Sandeep Ramesh Khanzode
> > > > >
> > > > > ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> > > > > For additional commands, e-mail: java-user-h...@lucene.apache.org