[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13610472#comment-13610472 ]

Commit Tag Bot commented on LUCENE-4575:

[branch_4x commit] Shai Erera
http://svn.apache.org/viewvc?view=revision&revision=1416367

LUCENE-4575: add IndexWriter.setCommitData

> Allow IndexWriter to commit, even just commitData
> -------------------------------------------------
>
>                 Key: LUCENE-4575
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4575
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>            Priority: Minor
>             Fix For: 4.1, 5.0
>
>         Attachments: LUCENE-4575.patch, LUCENE-4575.patch, LUCENE-4575.patch, LUCENE-4575-testcase.patch
>
>
> Spinoff from here http://lucene.472066.n3.nabble.com/commit-with-only-commitData-td4022155.html.
> In some cases, it is valuable to be able to commit changes to the index, even if the changes are just commitData. Such data is sometimes used by applications to register in the index some global application information/state.
> The proposal is:
> * Add a setCommitData() API and separate it from commit() and prepareCommit() (simplify their API)
> * When that API is called, flip on the dirty/changes bit, so that this gets committed even if no other changes were made to the index.
> I will work on a patch and post.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
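The proposal quoted in the issue description (a setCommitData() call that flips the dirty/changes bit so that a commit goes through even when nothing else changed) can be sketched with a small toy model. This is not Lucene code: the class ToyCommitWriter, its fields, and its boolean commit() return value are all hypothetical names chosen for illustration, and the real IndexWriter does far more work on commit.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy model only (hypothetical, not Lucene's IndexWriter). */
class ToyCommitWriter {
    private long changeCount = 0;           // bumped on every change
    private long lastCommitChangeCount = 0; // changeCount as of the last commit
    private Map<String, String> commitData = new HashMap<>();
    private Map<String, String> committedData = Collections.emptyMap();

    /** Stand-in for any document-level change. */
    void addDocument(String doc) {
        changeCount++;
    }

    /** Stores the commit data and marks the writer dirty. */
    void setCommitData(Map<String, String> data) {
        commitData = new HashMap<>(data);
        changeCount++; // the "dirty bit": a commitData-only change still counts
    }

    /** Returns true if a commit was written, false if there was nothing to commit. */
    boolean commit() {
        if (changeCount == lastCommitChangeCount) {
            return false; // no changes since the last commit
        }
        committedData = new HashMap<>(commitData);
        lastCommitChangeCount = changeCount;
        return true;
    }

    Map<String, String> committedData() {
        return committedData;
    }

    public static void main(String[] args) {
        ToyCommitWriter w = new ToyCommitWriter();
        System.out.println(w.commit()); // false: nothing changed yet
        w.setCommitData(Collections.singletonMap("state", "v1"));
        System.out.println(w.commit()); // true: commit data alone made the writer dirty
        System.out.println(w.committedData()); // {state=v1}
    }
}
```

In this sketch the setter and the commit call are separate, which mirrors the API split the issue proposes; whether the real implementation tracks dirtiness with a counter in exactly this way is an assumption.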
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508551#comment-13508551 ]

Commit Tag Bot commented on LUCENE-4575:

[branch_4x commit] Shai Erera
http://svn.apache.org/viewvc?view=revision&revision=1416367

LUCENE-4575: add IndexWriter.setCommitData
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508524#comment-13508524 ]

Commit Tag Bot commented on LUCENE-4575:

[trunk commit] Shai Erera
http://svn.apache.org/viewvc?view=revision&revision=1416361

LUCENE-4575: add IndexWriter.setCommitData
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508375#comment-13508375 ]

Michael McCandless commented on LUCENE-4575:

+1, looks great. Thanks Shai!
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508363#comment-13508363 ]

Shai Erera commented on LUCENE-4575:

I don't think that we should add more work to finishCommit() either. Being able to setCommitData after prep() is just a bonus. It didn't work before, and it will continue to not work now. And I can't think of a good use case for why an app would not be able to set commitData prior to prep(). If it comes up, we can discuss a solution again. At least we know that moving the commitData write to finishCommit will solve it.

I'll make sure the test exposes the bug you reported in IW.finishCommit().
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508356#comment-13508356 ]

Michael McCandless commented on LUCENE-4575:

bq. Hmmm ... setting the commitData on pendingCommit cannot work, b/c the commitData is written to segnOutput on prepareCommit().

Oh yeah ... I forgot about that :)

Hmm ... I don't think we should move writing the commit data to finishCommit? Is it really so hard for the app to provide the commit data before calling prepareCommit?
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508327#comment-13508327 ]

Shai Erera commented on LUCENE-4575:

Hmmm ... setting the commitData on pendingCommit cannot work, b/c the commitData is written to segnOutput on prepareCommit(). The following commit() merely calls infos.finishCommit(), which writes the checksum and closes the output.

Can we modify segmentInfos.write() to not write the commitData, but move it to finishCommit()? Not sure that I like this approach, because it means that finishCommit() will do slightly more work, which increases the chance of getting an IOException during commit() after prepareCommit() successfully returned, but on the other hand the gains might be worth it? Being able to write commitData after you know all your document additions/deletions/updates are 'safe' might prove valuable. And finishCommit() already does I/O, writing the checksum ...

What do you think?
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508319#comment-13508319 ]

Shai Erera commented on LUCENE-4575:

The test isn't exactly accurate, because it tests a scenario that is currently not supported. I.e., after calling prepareCommit(), nothing that you do on IW will be committed. Rather, to expose the bug it should be modified as follows:

{code}
iw.setCommitData(data1);
iw.prepareCommit();
iw.setCommitData(data2); // that will be ignored by follow-on commit
iw.commit();
checkCommitData(); // will see data1
iw.commit(); // that 'should' commit data2
checkCommitData(); // that will see data1 again, because of the copy that happens in finishCommit()
{code}

I'll modify the test like so and include it in my next patch.
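The scenario in the comment above can be replayed against a toy two-phase writer. This is a hypothetical model, not the IndexWriter implementation: TwoPhaseToyWriter, its copyBackOnFinish flag, and the replay() helper are invented names, and the only behavior taken from the thread is that prepareCommit() snapshots the commit data and that a copy back onto the live state at finish time can undo a later setCommitData() call.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy model only (hypothetical, not Lucene's IndexWriter). */
class TwoPhaseToyWriter {
    private final boolean copyBackOnFinish; // true reproduces the hazard under discussion
    private Map<String, String> liveData = new HashMap<>();
    private Map<String, String> pendingData = null; // snapshot taken by prepareCommit()
    private Map<String, String> committedData = new HashMap<>();
    private long changeCount = 0;
    private long pendingChangeCount = 0;
    private long lastCommitChangeCount = 0;

    TwoPhaseToyWriter(boolean copyBackOnFinish) {
        this.copyBackOnFinish = copyBackOnFinish;
    }

    void setCommitData(Map<String, String> data) {
        liveData = new HashMap<>(data);
        changeCount++;
    }

    void prepareCommit() {
        if (pendingData != null) {
            throw new IllegalStateException("prepareCommit was already called");
        }
        if (changeCount == lastCommitChangeCount) {
            return; // nothing to commit
        }
        pendingData = new HashMap<>(liveData); // data is fixed from here on
        pendingChangeCount = changeCount;
    }

    void commit() {
        if (pendingData == null) {
            prepareCommit();
        }
        if (pendingData == null) {
            return; // still nothing to commit
        }
        committedData = pendingData;
        lastCommitChangeCount = pendingChangeCount;
        if (copyBackOnFinish) {
            liveData = new HashMap<>(pendingData); // overwrites data set after prepareCommit()
        }
        pendingData = null;
    }

    /** Replays the scenario from the comment and returns what the second commit stored. */
    static String replay(boolean copyBackOnFinish) {
        TwoPhaseToyWriter iw = new TwoPhaseToyWriter(copyBackOnFinish);
        iw.setCommitData(Collections.singletonMap("tag", "data1"));
        iw.prepareCommit();
        iw.setCommitData(Collections.singletonMap("tag", "data2")); // too late for this commit
        iw.commit(); // stores data1
        iw.commit(); // should store data2
        return iw.committedData.get("tag");
    }

    public static void main(String[] args) {
        System.out.println(replay(true));  // data1: the copy-back undid the second setCommitData
        System.out.println(replay(false)); // data2: without the copy-back the change survives
    }
}
```

The flag exists only so both outcomes can be compared side by side; it corresponds loosely to the suggestion elsewhere in the thread to drop the copy in finishCommit.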
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508275#comment-13508275 ]

Michael McCandless commented on LUCENE-4575:

I'll make a test exposing the bug ...
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508268#comment-13508268 ]

Shai Erera commented on LUCENE-4575:

I'll make the changes, and also, as it seems you were suggesting earlier, allow setCommitData to affect the pendingCommit too. I think that's valuable because you can e.g. call prepareCommit() -> setCommitData() -> commit(): the setCommitData() in the middle lets you create a commitData that will pertain to the state of the index after the commit.

I'll make all the changes and post a new patch, probably tomorrow.
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508267#comment-13508267 ]

Michael McCandless commented on LUCENE-4575:

{quote}
I agree that calling that setCommitData in finishCommit is redundant, but perhaps we can solve it more elegantly by either:
# Not storing the setCommitData in infos, but rather in a private IW member. Then in startCommit set it on the cloned infos. It's essentially how it's done today, only now the commit data will be copied from a member.
# Stick w/ current API commit(commitData) and prepareCommit(commitData), and just make sure that commit goes through even if changeCount == previousChangeCount, but commitUserData != null.
{quote}

Hmm, I'd rather not store the member inside IW *and* inside SIS; just seems safer to have a single clear place where this is tracked. Also, I like the new API so I'd rather not do #2?

I think just removing that line in finishCommit should fix the bug ... but first we need a test exposing it.

bq. I think that the code in finishCommit ensures that we can always pull the commitData from segmentInfos?

Yes.
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508254#comment-13508254 ]

Shai Erera commented on LUCENE-4575:

bq. I thought we were going to rename ensureOpen's confusing boolean param?

Right, but for some reason I thought that you were going to do that :). I'll do it in the next patch.

bq. IW.setCommitData should be sync'd I think, eg to ensure visibility across threads of the changes to sis.userData?

Ok

bq. Hmm ... I think there's a thread hazard here, during commit

I think you're right. Not sure how practical, because I believe that usually the commit thread will also be the one that calls setCommitData, but it is possible.

I agree that calling that setCommitData in finishCommit is redundant, but perhaps we can solve it more elegantly by either:
# Not storing the setCommitData in infos, but rather in a private IW member. Then in startCommit set it on the cloned infos. It's essentially how it's done today, only now the commit data will be copied from a member.
# Stick w/ current API commit(commitData) and prepareCommit(commitData), and just make sure that commit goes through even if changeCount == previousChangeCount, but commitUserData != null.

Option #2 means that there's no API break, no synchronization is needed on setCommitData and practically everything remains the same. We can still remove the redundant .setCommitData in finishCommit regardless.

bq. should we add an IW.getCommitData?

I think that that'd be great! Today the only way to do it is if you refresh a reader (expensive). I think that the code in finishCommit ensures that we can always pull the commitData from segmentInfos?
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508251#comment-13508251 ]

Michael McCandless commented on LUCENE-4575:

Actually I think we should just remove that .setUserData inside finishCommit?

Also, should we add an IW.getCommitData?
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508245#comment-13508245 ]

Michael McCandless commented on LUCENE-4575:

I thought we were going to rename ensureOpen's confusing boolean param?

IW.setCommitData should be sync'd I think, eg to ensure visibility across threads of the changes to sis.userData?

Hmm ... I think there's a thread hazard here, during commit; I think if pendingCommit is not null you should also call pendingCommit.setUserData? Else, a commit can finish and "undo" the user's change to the commit data (see finishCommit, where it calls .setUserData). Maybe we need a thread safety test here ...
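The visibility point raised above (a synchronized setter so that other threads see the updated user data) can be illustrated with a minimal holder class. This is a sketch under stated assumptions: SyncCommitDataHolder and its hammer() helper are hypothetical, and the example only shows that synchronized accessors keep the data and the change counter consistent across threads; it does not model the pendingCommit hazard.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Toy model only (hypothetical, not Lucene's IndexWriter). */
class SyncCommitDataHolder {
    private Map<String, String> userData = Collections.emptyMap();
    private long changeCount = 0;

    /** synchronized: the write and the counter bump are visible to later synchronized readers. */
    synchronized void setCommitData(Map<String, String> data) {
        userData = new HashMap<>(data);
        changeCount++;
    }

    synchronized Map<String, String> getCommitData() {
        return userData;
    }

    synchronized long changeCount() {
        return changeCount;
    }

    /** Calls setCommitData from several threads and returns the final change count. */
    static long hammer(int threads, int callsPerThread) {
        final SyncCommitDataHolder holder = new SyncCommitDataHolder();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final String value = "thread-" + t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    holder.setCommitData(Collections.singletonMap("owner", value));
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return holder.changeCount();
    }

    public static void main(String[] args) {
        // With synchronized methods no increment is lost, so this prints 4000.
        System.out.println(hammer(4, 1000));
    }
}
```

Without the synchronized keyword the unsynchronized `changeCount++` could lose updates and readers would have no visibility guarantee, which is the kind of problem a thread safety test would try to expose.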
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506540#comment-13506540 ]

Shai Erera commented on LUCENE-4575:

bq. Then the only thing that would need to change is DataInput.readStringStringMap: s/HashMap/LinkedHashMap.

So you propose that the code will always initialize LHM in DataInput, that way preserving order whether required or not? Yes, I guess that we can do that. But I wonder if we should? We didn't so far, and nobody complained. And since it's an internal change, we can always make that change in the future if somebody asks?
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506530#comment-13506530 ]

Yonik Seeley commented on LUCENE-4575:

bq. I don't think though that it's trivial to support. Currently the user can pass any Map, but IndexReader returns in practice a HashMap (DataInput.readStringStringMap initializes a HashMap).

If a user cared about order, they would pass a LinkedHashMap. Then the only thing that would need to change is DataInput.readStringStringMap: s/HashMap/LinkedHashMap.

bq. it's not really related to how the map is set.

It is: if you make a copy of the map and we want to preserve order, the copy has to be a new LinkedHashMap instead of a HashMap. It's a minor enough point that I don't think it deserves its own issue. I don't personally care about preserving order, but I did think it was worth at least bringing up.
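The point about the copy can be illustrated with plain java.util collections. This is only a sketch: the class and helper names below are hypothetical and are not part of IndexWriter or of the attached patch. It shows that copying into a LinkedHashMap keeps the caller's iteration order, while a HashMap copy keeps the same entries but promises nothing about order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class; not Lucene code.
final class CommitDataCopyDemo {

    // Copy that keeps the iteration order of the map it was given.
    static Map<String, String> copyPreservingOrder(Map<String, String> userData) {
        return new LinkedHashMap<>(userData);
    }

    // Copy that keeps the entries but gives no ordering guarantee.
    static Map<String, String> copyIgnoringOrder(Map<String, String> userData) {
        return new HashMap<>(userData);
    }

    public static void main(String[] args) {
        Map<String, String> userData = new LinkedHashMap<>();
        userData.put("zeta", "1");
        userData.put("alpha", "2");
        userData.put("mid", "3");

        // Insertion order survives the LinkedHashMap copy.
        System.out.println(new ArrayList<>(copyPreservingOrder(userData).keySet())); // [zeta, alpha, mid]

        // Map.equals compares contents only, so the HashMap copy is still "equal".
        System.out.println(copyIgnoringOrder(userData).equals(userData)); // true
    }
}
```

Either copy satisfies the Map contract, which is why the choice only matters to callers that rely on iteration order.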
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506528#comment-13506528 ]

Uwe Schindler commented on LUCENE-4575:

The API returns Map, so we make no guarantees about order.
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506526#comment-13506526 ]

Shai Erera commented on LUCENE-4575:

We use commitData extensively, but we don't care about the order; we just store key/value pairs.

I don't think it's trivial to support, though. Currently the user can pass any Map, but in practice IndexReader returns a HashMap (DataInput.readStringStringMap initializes a HashMap). Therefore, if we want to preserve the type of the Map, we'd need to change the DataInput/DataOutput code. I'm not sure it's worth the hassle, but let's discuss that on a separate issue anyway? It's not really related to how the map is set.
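To make the round-trip concern concrete, here is a small self-contained sketch. It does not use Lucene's DataInput/DataOutput; the byte layout (entry count followed by key/value strings) and the names `write` and `read` are assumptions made up for illustration. The point is that whatever Map type the caller writes, the reading side decides which implementation comes back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Supplier;

// Hypothetical demo class; the format below is illustrative, not Lucene's.
final class MapRoundTripDemo {

    // Writes the entry count, then each key/value pair in iteration order.
    static byte[] write(Map<String, String> map) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(map.size());
            for (Map.Entry<String, String> e : map.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the pairs back into whatever Map the factory supplies.
    static Map<String, String> read(byte[] data, Supplier<Map<String, String>> factory) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            Map<String, String> map = factory.get();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                String value = in.readUTF();
                map.put(key, value);
            }
            return map;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> userData = new TreeMap<>(); // caller's choice of Map type
        userData.put("b", "2");
        userData.put("a", "1");
        byte[] stored = write(userData);

        // The caller's TreeMap is gone; the reader picked the implementation.
        System.out.println(read(stored, HashMap::new).getClass().getSimpleName()); // HashMap

        // A LinkedHashMap on the read side keeps the order in which pairs were written.
        System.out.println(read(stored, LinkedHashMap::new).keySet()); // [a, b]
    }
}
```

In this sketch, preserving the written order needs a change only on the read side, whereas preserving the caller's exact Map type would also require recording that type when writing.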
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506492#comment-13506492 ]

Yonik Seeley commented on LUCENE-4575:

bq. I currently copy the commitData map on setCommitData. It seems safe to do it, and I don't think commitData are huge. Any objections?

Do any users care about order (i.e. they pass in a LinkedHashMap)? It would be trivial to preserve *if* it added value for some.
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506468#comment-13506468 ]

Shai Erera commented on LUCENE-4575:

Thanks. I forgot to mention two things about the changes in the patch that I wasn't sure about:

# I currently copy the commitData map on setCommitData. It seems safe to do so, and I don't think commitData maps are huge. Any objections?
# I pass the copied map directly to segmentInfos, rather than saving it in a member of IW. Do you see any issues with that? (I'm thinking about rollback, even though we have another copy of the segmentInfos for rollback purposes ...)
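The two ideas discussed here, copying the map when it is set and marking the writer as changed so that a commit goes through even without document changes, can be sketched as follows. This is a standalone illustration, not the patch: `CommitDataWriterSketch`, its fields, and the `boolean` return of `commit()` are all hypothetical and only stand in for the behaviour the issue proposes.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a writer; not IndexWriter and not the attached patch.
final class CommitDataWriterSketch {
    private Map<String, String> commitData = Collections.emptyMap();
    private boolean dirty;   // stands in for the writer's "changes" bit
    private int commitCount; // how many commits were actually performed

    // Defensive copy: later changes to the caller's map do not leak in.
    // Setting commit data alone is enough to make the next commit do work.
    synchronized void setCommitData(Map<String, String> userData) {
        commitData = new HashMap<>(userData);
        dirty = true;
    }

    // Returns true if something was committed, false if there was nothing to do.
    synchronized boolean commit() {
        if (!dirty) {
            return false;
        }
        commitCount++;
        dirty = false;
        return true;
    }

    synchronized Map<String, String> getCommitData() {
        return Collections.unmodifiableMap(commitData);
    }

    synchronized int getCommitCount() {
        return commitCount;
    }

    public static void main(String[] args) {
        CommitDataWriterSketch writer = new CommitDataWriterSketch();
        System.out.println(writer.commit()); // false: nothing changed yet

        Map<String, String> caller = new HashMap<>();
        caller.put("syncPoint", "42");
        writer.setCommitData(caller);
        caller.put("syncPoint", "changed-later"); // must not affect the writer's copy

        System.out.println(writer.commit());                             // true: commitData alone is a change
        System.out.println(writer.getCommitData().get("syncPoint"));    // 42
        System.out.println(writer.commit());                             // false: already committed
    }
}
```

The copy costs one small allocation per call, which matches the expectation in the comment above that commitData maps are not large.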
[jira] [Commented] (LUCENE-4575) Allow IndexWriter to commit, even just commitData
[ https://issues.apache.org/jira/browse/LUCENE-4575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506432#comment-13506432 ]

Michael McCandless commented on LUCENE-4575:

+1 to do a hard break; this is expert.