[jira] [Commented] (LUCENE-4272) another idea for updatable fields

Shai Erera (JIRA) Thu, 20 Dec 2012 13:49:16 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-4272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537408#comment-13537408
 ]


Shai Erera commented on LUCENE-4272:
------------------------------------

That remains to be seen. I don't think storing entire documents (term vectors 
or not) is going to scale either. Merging will just merge this data over and 
over, unless you put it in another index or something. Sivan and I tried that 
in a project (before 4258), and it didn't perform well: fetching the content 
from a stored field for every tiny update (yes, we did #2 and #3, not just #3) 
simply didn't perform.

I think we're coming from different worlds. We may need to develop two 
different solutions for field updates, each better suited to some scenarios.

Or, hopefully, the approach in 4258 will prove to perform well enough, so 
that we can stick w/ just one approach.
                
> another idea for updatable fields
> ---------------------------------
>
>                 Key: LUCENE-4272
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4272
>             Project: Lucene - Core
>          Issue Type: New Feature
>            Reporter: Robert Muir
>
> I've been reviewing the ideas for updatable fields and have an alternative
> proposal that I think would address my biggest concern:
> * not slowing down searching
> When I look at what Solr and Elasticsearch do here, by basically reindexing 
> from stored fields, I think they solve a lot of the problem: users don't have 
> to "rebuild" their document from scratch just to update one tiny piece.
> But I think we can do this more efficiently: by avoiding reindexing of the 
> unaffected fields.
> The basic idea is that we would require term vectors for this approach (as 
> they already store a serialized indexed version of the doc), and so we could 
> just take the other pieces from the existing vectors for the doc.
> I think we would have to extend vectors to also store the norm (so we don't 
> recompute that), and payloads, but it seems feasible at a glance.
> I don't think we should discard the idea because vectors are slow/big today; 
> this seems like something we could fix.
> Personally I like the idea of not slowing down search performance to solve 
> the problem, I think we should really start from that angle and work towards 
> making the indexing side more efficient, not vice-versa.
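
To make the two update strategies discussed above concrete, here is a toy sketch in Python. It is not Lucene code and uses no Lucene API: the `analyze` function, the dict-based "stored fields" and "vectors", and both update functions are hypothetical stand-ins. It only illustrates that rebuilding from stored fields re-analyzes every field, while reusing per-field vectors re-analyzes just the changed one.

```python
from collections import Counter

def analyze(text):
    """Toy analyzer (assumption): lowercase, split on whitespace, count terms."""
    return Counter(text.lower().split())

def update_by_reindexing(stored, field, new_text):
    """Rebuild every field's terms from its stored text."""
    new_stored = {**stored, field: new_text}
    new_vectors = {name: analyze(text) for name, text in new_stored.items()}
    return new_stored, new_vectors, len(new_stored)  # number of fields analyzed

def update_reusing_vectors(stored, vectors, field, new_text):
    """Re-analyze only the changed field; copy the other vectors unchanged."""
    new_stored = {**stored, field: new_text}
    new_vectors = {**vectors, field: analyze(new_text)}
    return new_stored, new_vectors, 1  # only one field analyzed

# Hypothetical three-field document used for illustration.
stored = {"title": "Lucene in Action", "body": "a long body text", "price": "10"}
vectors = {name: analyze(text) for name, text in stored.items()}

_, v1, n1 = update_by_reindexing(stored, "price", "12")
_, v2, n2 = update_reusing_vectors(stored, vectors, "price", "12")
print(v1 == v2, n1, n2)
```

Both paths end with the same per-field terms; the difference is how much analysis work each update repeats. The sketch says nothing about merge cost or storage size, which is the concern raised in the comment above.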

