[jira] [Comment Edited] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223986#comment-17223986 ]

Zach Chen edited comment on LUCENE-8982 at 10/31/20, 3:21 AM:
--------------------------------------------------------------

Thanks Michael! I've opened an initial PR with the proposed changes, but also have a few follow-up questions:
# Are new unit tests (other than benchmarking tests) needed, given these changes may not have direct side-effects to assert on?
# Shall I go ahead and remove code that may no longer be used after these changes, such as the related method in *NativePosixUtil.cpp*?

The PR also needs some more updates to comments to reflect the new way of doing direct IO. I can work on that once the changes are close to final.

was (Author: zacharymorn):
Thanks Michael! I've opened an initial PR with the proposed changes, but also have a few follow-up questions:
# Are new unit tests (other than benchmarking tests) needed, given these changes may not have direct side-effects to assert on?
# Shall I go ahead and remove code that may no longer be used after these changes, such as the related method in `NativePosixUtil.cpp`?

The PR also needs some more updates to comments to reflect the new way of doing direct IO. I can work on that once the changes are close to final.

> Make NativeUnixDirectory pure java now that direct IO is possible
> -----------------------------------------------------------------
>
>                 Key: LUCENE-8982
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8982
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/misc
>            Reporter: Michael McCandless
>            Priority: Major
>
> {{NativeUnixDirectory}} is a {{Directory}} implementation that uses direct IO to write newly merged segments. Direct IO bypasses the kernel's buffer cache and write cache, making merge writes "invisible" to the kernel, though the reads for merging the N segments still go through the kernel.
> But today, {{NativeUnixDirectory}} uses a small JNI wrapper to access the {{O_DIRECT}} flag to {{open}} ... since JDK9 we can now pass that flag in pure java code, so we should now fix {{NativeUnixDirectory}} to not use JNI anymore.
> We should also run some more realistic benchmarks to see if this option really helps nodes that are doing concurrent indexing (merging) and searching.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org
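For context, a minimal sketch of what "pure java" direct IO looks like: a file opened with {{O_DIRECT}} semantics through {{ExtendedOpenOption.DIRECT}}, with no JNI wrapper. This is not the actual NativeUnixDirectory code; the file name and the 4096-byte block size are illustrative assumptions, and direct IO requires block-aligned buffer addresses, offsets, and lengths, which is why the buffer is over-allocated and aligned-sliced.

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DirectIoSketch {
    public static void main(String[] args) throws Exception {
        int blockSize = 4096; // assumed filesystem block size; query it in real code
        Path path = Path.of("segment.tmp"); // hypothetical output file
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                ExtendedOpenOption.DIRECT)) { // O_DIRECT, no JNI needed
            // Direct IO also requires the buffer's memory address to be
            // aligned; over-allocate and slice at an aligned offset.
            ByteBuffer buf = ByteBuffer.allocateDirect(2 * blockSize)
                                       .alignedSlice(blockSize);
            buf.limit(blockSize); // write exactly one block
            while (buf.hasRemaining()) {
                buf.put((byte) 0); // zero-filled block as a stand-in payload
            }
            buf.flip();
            ch.write(buf);
        }
    }
}
```

Note that {{ExtendedOpenOption.DIRECT}} lives in the JDK-specific `com.sun.nio.file` package, and writes fail on filesystems that do not support O_DIRECT (e.g. tmpfs).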
[jira] [Commented] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223986#comment-17223986 ]

Zach Chen commented on LUCENE-8982:
-----------------------------------

Thanks Michael! I've opened an initial PR with the proposed changes, but also have a few follow-up questions:
# Are new unit tests (other than benchmarking tests) needed, given these changes may not have direct side-effects to assert on?
# Shall I go ahead and remove code that may no longer be used after these changes, such as the related method in `NativePosixUtil.cpp`?

The PR also needs some more updates to comments to reflect the new way of doing direct IO. I can work on that once the changes are close to final.
[jira] [Commented] (SOLR-14975) Optimize CoreContainer.getAllCoreNames and getLoadedCoreNames
[ https://issues.apache.org/jira/browse/SOLR-14975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223975#comment-17223975 ]

Erick Erickson commented on SOLR-14975:
---------------------------------------

Yeah, these look like they're over-used. I'm not certain, though, that checking getCoreDescriptor is sufficient in all cases. SolrCores.getAllCoreNames, for instance, gets names from SolrCores, the TransientCacheHandler _and_ residentDescriptors. Can we guarantee that the names in residentDescriptors are a superset of cores.keySet, and similarly for getTransientCacheHandler().getCoreDescriptor vs. getTransientCacheHandler().getAllCorenames()?

BTW, SolrCores.getCoreDescriptors() (note the 's') also has the issue, although it may not be called very often.

I certainly do remember that getting all of that to work when I made transientCacheHandler pluggable was a bear. That said, the code has evolved since that long-ago date. I'm not defending the current implementation, just sayin' that it may be... less pleasant to change than it appears at first glance.

I'm not sure queries block on these. CoreContainer.getCore(some_core) doesn't look like it calls these on a quick glance. Is there someplace else they're called for queries? NOTE: it's Friday night so I'm not too sharp and it may be obvious.

All that said, assembling these lists is expensive and anything we can do to cut down on calling them is a fine idea. They may well be used because they were a pre-existing call, without an understanding of the costs. If that's the case, I do wonder whether there's some mechanism to keep them from being used inappropriately in the future. Of course, if whoever picks this up is really ambitious, much of this could be re-thought...

> Optimize CoreContainer.getAllCoreNames and getLoadedCoreNames
> -------------------------------------------------------------
>
>                 Key: SOLR-14975
>                 URL: https://issues.apache.org/jira/browse/SOLR-14975
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: David Smiley
>            Priority: Major
>
> The methods CoreContainer.getAllCoreNames and getLoadedCoreNames hold a lock while they grab core names to put into a TreeSet. When there are *many* cores, this delay is noticeable. Holding this lock effectively blocks queries since queries look up a core; so it's critically important that these methods are *fast*. The tragedy here is that some callers merely want to know if a particular name is in the set, or what the aggregated size is. Some callers want to iterate the names but don't really care what the iteration order is.
> I propose that some callers of these two methods find suitable alternatives, like getCoreDescriptor to check for null. And I propose that these methods return a HashSet -- no order. If the caller wants it sorted, it can do so itself.
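To illustrate the trade-off being discussed, here is a hypothetical sketch of the proposal; the class and field names below (CoreRegistry, descriptors) are illustrative stand-ins, not the actual CoreContainer/SolrCores internals. The idea: return an unordered copy instead of a sorted one, and let pure membership checks skip the copy entirely.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the real core registry.
class CoreRegistry {
    private final Map<String, Object> descriptors = new ConcurrentHashMap<>();

    void register(String name, Object descriptor) {
        descriptors.put(name, descriptor);
    }

    // Before: a sorted copy built while holding a lock -- O(n log n) per
    // call, and every caller pays for an ordering it may not need.
    synchronized Set<String> getAllCoreNamesSorted() {
        return new TreeSet<>(descriptors.keySet());
    }

    // Proposed: an unordered copy; callers that want order sort it themselves.
    Set<String> getAllCoreNames() {
        return new HashSet<>(descriptors.keySet());
    }

    // Callers that only test for existence avoid any copy at all,
    // analogous to checking getCoreDescriptor(name) for null.
    boolean isLoaded(String name) {
        return descriptors.containsKey(name);
    }
}
```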
[jira] [Created] (SOLR-14975) Optimize CoreContainer.getAllCoreNames and getLoadedCoreNames
David Smiley created SOLR-14975:
-----------------------------------

             Summary: Optimize CoreContainer.getAllCoreNames and getLoadedCoreNames
                 Key: SOLR-14975
                 URL: https://issues.apache.org/jira/browse/SOLR-14975
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: David Smiley

The methods CoreContainer.getAllCoreNames and getLoadedCoreNames hold a lock while they grab core names to put into a TreeSet. When there are *many* cores, this delay is noticeable. Holding this lock effectively blocks queries since queries look up a core; so it's critically important that these methods are *fast*. The tragedy here is that some callers merely want to know if a particular name is in the set, or what the aggregated size is. Some callers want to iterate the names but don't really care what the iteration order is.

I propose that some callers of these two methods find suitable alternatives, like getCoreDescriptor to check for null. And I propose that these methods return a HashSet -- no order. If the caller wants it sorted, it can do so itself.
[jira] [Commented] (LUCENE-9594) Linear function for FeatureField
[ https://issues.apache.org/jira/browse/LUCENE-9594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223890#comment-17223890 ]

Mayya Sharipova commented on LUCENE-9594:
-----------------------------------------

PR: https://github.com/apache/lucene-solr/pull/2051

> Linear function for FeatureField
> --------------------------------
>
>                 Key: LUCENE-9594
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9594
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Mayya Sharipova
>            Priority: Minor
>
> Currently FeatureField supports only 3 functions: log, saturation and sigmoid.
> It is useful for certain cases to have a linear function.
[jira] [Created] (LUCENE-9594) Linear function for FeatureField
Mayya Sharipova created LUCENE-9594:
---------------------------------------

             Summary: Linear function for FeatureField
                 Key: LUCENE-9594
                 URL: https://issues.apache.org/jira/browse/LUCENE-9594
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Mayya Sharipova

Currently FeatureField supports only 3 functions: log, saturation and sigmoid. It is useful for certain cases to have a linear function.
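For reference, the score shapes involved can be sketched as plain functions; here w is the query weight, s the indexed feature value, and a/k/x per-function parameters. The log/saturation/sigmoid forms follow the FeatureField javadocs; `linear` is the proposed addition, not an existing API.

```java
// Sketch of the scoring function shapes; not Lucene code.
final class FeatureScoreShapes {
    static double log(double w, double a, double s) {
        return w * Math.log(a + s);              // existing: log
    }
    static double saturation(double w, double k, double s) {
        return w * s / (s + k);                  // existing: saturation
    }
    static double sigmoid(double w, double k, double x, double s) {
        // existing: sigmoid, s^x / (s^x + k^x)
        return w * Math.pow(s, x) / (Math.pow(s, x) + Math.pow(k, x));
    }
    static double linear(double w, double s) {
        return w * s;                            // proposed: grows proportionally
    }
}
```

The saturation and sigmoid shapes are bounded by w, whereas a linear score is unbounded in the feature value, which is the property some use cases want.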
[jira] [Resolved] (SOLR-14974) Modernize and clean up document clustering contrib (8.x backport)
[ https://issues.apache.org/jira/browse/SOLR-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dawid Weiss resolved SOLR-14974.
--------------------------------
    Resolution: Won't Fix

Can't apply the patch because the 8.x line is permanently on Java 1.8.

> Modernize and clean up document clustering contrib (8.x backport)
> -----------------------------------------------------------------
>
>                 Key: SOLR-14974
>                 URL: https://issues.apache.org/jira/browse/SOLR-14974
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Major
>             Fix For: 8.8
>
> Backported code is at: https://github.com/dweiss/lucene-solr/tree/SOLR-14974
> It will only compile and pass tests under Java 11.
[jira] [Created] (LUCENE-9593) Update java docs in index/TestBackwardsCompatibility class to use gradle and not ant
Gautam Worah created LUCENE-9593:
------------------------------------

             Summary: Update java docs in index/TestBackwardsCompatibility class to use gradle and not ant
                 Key: LUCENE-9593
                 URL: https://issues.apache.org/jira/browse/LUCENE-9593
             Project: Lucene - Core
          Issue Type: Improvement
          Components: core/index
    Affects Versions: 8.6.3
            Reporter: Gautam Worah

In [PR-1733|https://github.com/apache/lucene-solr/pull/1733] we noticed that the comments for generating old indexes in {{org.apache.lucene.index.TestBackwardsCompatibility}} use ant instead of gradle. However, since support for ant has been removed, we should update the docs to reflect the new command.

New tests in the same [PR-1733|https://github.com/apache/lucene-solr/pull/1733] suggest that the change in the comments is simple: replace `ant test` with `gradlew test` (with no change in the JVM args). I've verified that this works for the tests in [PR-1733|https://github.com/apache/lucene-solr/pull/1733].
[jira] [Commented] (SOLR-14971) AtomicUpdate 'remove' fails on 'pints' fields
[ https://issues.apache.org/jira/browse/SOLR-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223746#comment-17223746 ]

Jason Gerlowski commented on SOLR-14971:
----------------------------------------

It turns out this is a type problem, but not the one I guessed above. The "existing document" values aren't Strings - they're Longs!

The code in question is the {{removeObj}} method here:

{code}
  private void removeObj(@SuppressWarnings({"rawtypes"}) Collection original, Object toRemove, String fieldName) {
    if (isChildDoc(toRemove)) {
      removeChildDoc(original, (SolrInputDocument) toRemove);
    } else {
      original.remove(getNativeFieldValue(fieldName, toRemove));
    }
  }
{code}

When removing an int value, {{getNativeFieldValue}} is consistent in always returning an integer. But the 'original' Collection of existing values has different types depending on whether it was retrieved from the update log (longs) or from the index (ints). Java's Number classes declare {{equals()}} in such a way that a Long is never equal to an Integer, even when the two represent the same numeric quantity. So a type mismatch causes the {{remove}} attempt to fail when the existing doc is retrieved from the tlog.

This cause has one upside - while it's likely to affect int/long and double/float, it seems specific to numerics and doesn't translate to other types.

There are a couple of ways we could address this:
# Make sure the tlog's SolrInputDocument has int values instead of longs where appropriate.
# Special-case numeric values in {{AtomicUpdateDocumentMerger.removeObj}} and add custom removal logic that handles the inconsistency in our input types.

Conceptually I like (1) much better - it'd be nice if the atomic-update code (and anything else that pulls RTG values) didn't have to handle a bunch of arbitrary variations in its input. But this could be a big change and might run afoul of whatever legitimate reasons there might be for storing the values as Longs in the UpdateLog.

Anyway, I'll probably go with the admittedly-hackier custom logic approach, despite not liking it. If anyone sees a better way forward though, please let me know.

> AtomicUpdate 'remove' fails on 'pints' fields
> ---------------------------------------------
>
>                 Key: SOLR-14971
>                 URL: https://issues.apache.org/jira/browse/SOLR-14971
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: update
>    Affects Versions: 8.5.2
>            Reporter: Jason Gerlowski
>            Assignee: Jason Gerlowski
>            Priority: Major
>         Attachments: reproduce.sh
>
> The "remove" atomic update action on multivalued int fields fails if the document being changed is uncommitted.
> At first glance this appears to be a type-related issue. AtomicUpdateDocumentMerger attempts to handle multivalued int fields by processing the List type, but in uncommitted docs int fields are still List in the tlog. Conceptually this feels similar to SOLR-13331.
> It's likely this issue also affects other numeric and date fields.
> Attached is a simple script to reproduce, meant to be run from the root of a Solr install.
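The Long-vs-Integer mismatch described above is easy to reproduce outside Solr; here is a standalone demonstration, with a plain list standing in for the values pulled from the tlog:

```java
import java.util.ArrayList;
import java.util.List;

public class NumericRemoveDemo {
    public static void main(String[] args) {
        // Existing values as the update log returns them: boxed Longs.
        List<Object> fromTlog = new ArrayList<>(List.of(1L, 2L, 3L));

        // Collection.remove uses equals(), and Long(2).equals(Integer(2))
        // is false, so removing the "native" Integer silently does nothing.
        System.out.println(fromTlog.remove(Integer.valueOf(2))); // false
        System.out.println(fromTlog.size());                     // 3

        // Matching the element type makes the removal succeed.
        System.out.println(fromTlog.remove(Long.valueOf(2)));    // true
        System.out.println(fromTlog.size());                     // 2
    }
}
```

This is also why option (2) above has to compare values numerically rather than rely on `equals()` across boxed types.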
[jira] [Assigned] (SOLR-14971) AtomicUpdate 'remove' fails on 'pints' fields
[ https://issues.apache.org/jira/browse/SOLR-14971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Gerlowski reassigned SOLR-14971:
--------------------------------------
    Assignee: Jason Gerlowski
[jira] [Commented] (LUCENE-9583) How should we expose VectorValues.RandomAccess?
[ https://issues.apache.org/jira/browse/LUCENE-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223706#comment-17223706 ]

Jim Ferenczi commented on LUCENE-9583:
--------------------------------------

By "wrong message" I mean that we require two implementations where only one is needed. It will be difficult to optimize one type of access without hurting the other, so I'd lean toward a single pattern. If it's random access, so be it, but the pros/cons should be considered carefully. The forward iterator design is constraining, but it also forces one to think about how to access the data efficiently.

bq. Yes, think of parent/child index with vectors only on the parent

I see these ordinals as an optimization detail of how the graph stores things. I don't think they should be exposed at all, since the user should interact with doc ids directly. It's something that could come later if needed, but that sounds like complexity we could avoid when introducing a new format. We don't need to optimize for the sparse case, at least not yet ;).

> How should we expose VectorValues.RandomAccess?
> -----------------------------------------------
>
>                 Key: LUCENE-9583
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9583
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael Sokolov
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the newly-added {{VectorValues}} API, we have a {{RandomAccess}} sub-interface. [~jtibshirani] pointed out this is not needed by some vector-indexing strategies which can operate solely using a forward iterator (it is needed by HNSW), and so in the interest of simplifying the public API we should not expose this internal detail (which, by the way, surfaces internal ordinals that are somewhat uninteresting outside the random access API).
> I looked into how to move this inside the HNSW-specific code and remembered that we do also currently make use of the RA API when merging vector fields over sorted indexes. Without it, we would need to load all vectors into RAM while flushing/merging, as we currently do in {{BinaryDocValuesWriter.BinaryDVs}}. I wonder if it's worth paying this cost for the simpler API.
> Another thing I noticed while reviewing this is that I moved the KNN {{search(float[] target, int topK, int fanout)}} method from {{VectorValues}} to {{VectorValues.RandomAccess}}. This I think we could move back, and handle the HNSW requirements for search elsewhere. I wonder if that would alleviate the major concern here?
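To make the two access patterns under discussion concrete, here is a hypothetical in-memory sketch; the class and method names are illustrative, not the actual {{VectorValues}} API. Forward iteration suits flush/merge-style consumers that visit every vector once, while HNSW graph search needs to fetch vectors by dense ordinal in arbitrary order.

```java
// Illustrative only: a toy vector store showing both access styles.
class ToyVectorValues {
    private final float[][] vectors; // one vector per dense ordinal
    private int cursor = -1;

    ToyVectorValues(float[][] vectors) {
        this.vectors = vectors;
    }

    // Forward-iterator style: visit each vector once, in order.
    boolean next() {
        return ++cursor < vectors.length;
    }
    float[] current() {
        return vectors[cursor];
    }

    // Random-access style: fetch by ordinal in any order, as HNSW
    // traversal does when scoring a node's neighbors.
    int size() {
        return vectors.length;
    }
    float[] vectorValue(int ord) {
        return vectors[ord];
    }
}
```

A format that only promises the forward style can stream vectors from disk; supporting the random-access style efficiently is the extra constraint being weighed here.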
[jira] [Resolved] (SOLR-14955) Add env var options for the Prometheus Exporter bin scripts
[ https://issues.apache.org/jira/browse/SOLR-14955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Houston Putman resolved SOLR-14955.
-----------------------------------
    Fix Version/s: 8.8
       Resolution: Fixed

> Add env var options for the Prometheus Exporter bin scripts
> -----------------------------------------------------------
>
>                 Key: SOLR-14955
>                 URL: https://issues.apache.org/jira/browse/SOLR-14955
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public (Default Security Level. Issues are Public)
>          Components: contrib - prometheus-exporter
>            Reporter: Houston Putman
>            Assignee: Houston Putman
>            Priority: Major
>             Fix For: 8.8
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The prometheus exporter bin scripts get the job done, but are quite lean and don't provide much help in setting up the exporter.
> A list of things that could be improved:
> * Allowing users to pass [command line options|https://lucene.apache.org/solr/guide/8_6/monitoring-solr-with-prometheus-and-grafana.html#command-line-parameters] through environment variables
> * Support the ZK ACL environment variables
> * Similar memory options to solr/bin/solr
> These are just a few ideas, but a little work would go a long way here.
> Previous discussion:
> * [docker-solr#307|https://github.com/docker-solr/docker-solr/issues/307]
> * [docker-solr#224|https://github.com/docker-solr/docker-solr/pull/224]
[jira] [Commented] (SOLR-14955) Add env var options for the Prometheus Exporter bin scripts
[ https://issues.apache.org/jira/browse/SOLR-14955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223700#comment-17223700 ]

ASF subversion and git services commented on SOLR-14955:
--------------------------------------------------------

Commit 5b79ad3d64638556b04528bd941be3e41fc66a3a in lucene-solr's branch refs/heads/branch_8x from Houston Putman
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=5b79ad3 ]

SOLR-14955: Add env var options to Prometheus Export scripts. (#2038)
[jira] [Commented] (SOLR-14955) Add env var options for the Prometheus Exporter bin scripts
[ https://issues.apache.org/jira/browse/SOLR-14955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223699#comment-17223699 ]

ASF subversion and git services commented on SOLR-14955:
--------------------------------------------------------

Commit e1698bda95dcd76e1d0bdf351c3672d68478eb80 in lucene-solr's branch refs/heads/master from Houston Putman
[ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e1698bd ]

SOLR-14955: Add env var options to Prometheus Export scripts. (#2038)
[jira] [Commented] (LUCENE-8982) Make NativeUnixDirectory pure java now that direct IO is possible
[ https://issues.apache.org/jira/browse/LUCENE-8982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223693#comment-17223693 ]

Michael McCandless commented on LUCENE-8982:
--------------------------------------------

Yes please! Feel free to tackle this! I can help w/ benchmarking.
[jira] [Commented] (LUCENE-9583) How should we expose VectorValues.RandomAccess?
[ https://issues.apache.org/jira/browse/LUCENE-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223684#comment-17223684 ]

Michael Sokolov commented on LUCENE-9583:
-----------------------------------------

bq. Ok so we need dense ordinals because you expect to have lots of documents without a value?

Yes, think of parent/child index with vectors only on the parent

bq. If random access is the way to go then we don't need the forward iterator but I agree with Julie that it may send the wrong message which is why I proposed to add the reset method.

It's true we may not need a forward iterator. I'm not sure what you mean by the "wrong message" though.
[jira] [Comment Edited] (LUCENE-9583) How should we expose VectorValues.RandomAccess?
[ https://issues.apache.org/jira/browse/LUCENE-9583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223684#comment-17223684 ]

Michael Sokolov edited comment on LUCENE-9583 at 10/30/20, 2:45 PM:
--------------------------------------------------------------------

bq. Ok so we need dense ordinals because you expect to have lots of documents without a value?

Yes, think of parent/child index with vectors only on the parent

bq. If random access is the way to go then we don't need the forward iterator but I agree with Julie that it may send the wrong message which is why I proposed to add the reset method.

It's true we may not need a forward iterator. I'm not sure what you mean by the "wrong message" though.

was (Author: sokolov):
> Ok so we need dense ordinals because you expect to have lots of documents without a value?

Yes, think of parent/child index with vectors only on the parent

> If random access is the way to go then we don't need the forward iterator but I agree with Julie that it may send the wrong message which is why I proposed to add the reset method.

It's true we may not need a forward iterator. I'm not sure what you mean by the "wrong message" though.
[jira] [Updated] (SOLR-14926) Modernize and clean up document clustering contrib
[ https://issues.apache.org/jira/browse/SOLR-14926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated SOLR-14926: --- Fix Version/s: (was: 8.8) master (9.0) > Modernize and clean up document clustering contrib > -- > > Key: SOLR-14926 > URL: https://issues.apache.org/jira/browse/SOLR-14926 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Major > Fix For: master (9.0) > > > The clustering contrib was written a long time ago and it shows its age. We > have two separate interfaces (for clustering search results and for > clustering documents). The second one never received any attention and no > implementation exists for it. > I plan to do the following: > - *remove* the document clustering interface entirely, leave only the > post-search results clustering extension, > - upgrade the implementation to Carrot2 4.x (this gets rid of those > long-standing odd dependencies), > - clean up the code where appropriate. > My plan is to apply this to master, initially, but also backport to 8x if > there are no objections. I don't think it'll hurt anybody. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14974) Modernize and clean up document clustering contrib (8.x backport)
[ https://issues.apache.org/jira/browse/SOLR-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated SOLR-14974: --- Fix Version/s: 8.8 > Modernize and clean up document clustering contrib (8.x backport) > - > > Key: SOLR-14974 > URL: https://issues.apache.org/jira/browse/SOLR-14974 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Major > Fix For: 8.8 > > > Backported code is at: > https://github.com/dweiss/lucene-solr/tree/SOLR-14974 > Will only compile and pass tests under Java 11. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14974) Modernize and clean up document clustering contrib (8.x backport)
[ https://issues.apache.org/jira/browse/SOLR-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated SOLR-14974: --- Description: Backported code is at: https://github.com/dweiss/lucene-solr/tree/SOLR-14974 Will only compile and pass tests under Java 11. > Modernize and clean up document clustering contrib (8.x backport) > - > > Key: SOLR-14974 > URL: https://issues.apache.org/jira/browse/SOLR-14974 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Major > > Backported code is at: > https://github.com/dweiss/lucene-solr/tree/SOLR-14974 > Will only compile and pass tests under Java 11. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14974) Modernize and clean up document clustering contrib (8.x backport)
[ https://issues.apache.org/jira/browse/SOLR-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223671#comment-17223671 ] Dawid Weiss commented on SOLR-14974: Nah, brainfart. This won't work either, because we rely on classes from those JARs compiled under Java 11, and they won't even be parsed by a 1.8 compiler. I think this is a showstopper. > Modernize and clean up document clustering contrib (8.x backport) > - > > Key: SOLR-14974 > URL: https://issues.apache.org/jira/browse/SOLR-14974 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org

[jira] [Commented] (SOLR-14974) Modernize and clean up document clustering contrib (8.x backport)
[ https://issues.apache.org/jira/browse/SOLR-14974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223667#comment-17223667 ] Dawid Weiss commented on SOLR-14974: The patch is there. One thing I didn't realize is that 8.x is still on Java 1.8... This means this cannot be backported (because dependency is on Java 11). I wonder what the consensus is for such cases (contribs relying on dependencies requiring Java > Solr core). I plan to add assumptions to tests so that they don't run on Java < 11 but otherwise I think it's still worth applying against Solr 8.x - this limits the number of CVE reports and just makes the contrib functional in general. > Modernize and clean up document clustering contrib (8.x backport) > - > > Key: SOLR-14974 > URL: https://issues.apache.org/jira/browse/SOLR-14974 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
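The "add assumptions to tests so that they don't run on Java < 11" approach mentioned above could look roughly like this. This is a minimal sketch with hypothetical names; the actual Lucene/Solr tests would use the test framework's Assume mechanism rather than a plain check in main:

```java
public class JavaVersionGuard {
    /** True when the running JVM's feature version is at least the given one. */
    public static boolean atLeast(int featureVersion) {
        // Runtime.version() is available since Java 9.
        return Runtime.version().feature() >= featureVersion;
    }

    public static void main(String[] args) {
        // A test guarded this way is skipped (not failed) on older JVMs;
        // in a JUnit test this would be Assume.assumeTrue(atLeast(11)).
        if (!atLeast(11)) {
            System.out.println("Skipping: clustering contrib tests require Java 11+");
            return;
        }
        System.out.println("Java 11+ detected; test body would run here");
    }
}
```

The design choice here is skip-not-fail: on a 1.8 or Java 9/10 JVM the build stays green, while the contrib's Java 11 dependency is exercised wherever a new enough JVM is present.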
[jira] [Created] (SOLR-14974) Modernize and clean up document clustering contrib (8.x backport)
Dawid Weiss created SOLR-14974: -- Summary: Modernize and clean up document clustering contrib (8.x backport) Key: SOLR-14974 URL: https://issues.apache.org/jira/browse/SOLR-14974 Project: Solr Issue Type: Improvement Security Level: Public (Default Security Level. Issues are Public) Reporter: Dawid Weiss Assignee: Dawid Weiss -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-10825) [subquery] facets support
[ https://issues.apache.org/jira/browse/SOLR-10825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223654#comment-17223654 ] Munendra S N commented on SOLR-10825: - +1 to support this. Currently, a user can pass facets and they get computed, but they won't be returned. We had a use case that required computing stats on a subquery, so we had to implement a custom doc transformer. I think we should return the complete response from the subquery to the user, although this change would break backward compatibility. Another thing: currently the whole Solr document is converted into parameters for the subquery, which can be expensive for large fields (which ideally shouldn't be stored in Solr anyway). I understand the reason for passing the whole doc as parameters is that we can't be sure which doc fields the subquery uses. One workaround we implemented in a custom doc transformer is to let the user specify which fields of the main document to use; when not specified, it defaults to the current behavior. This might not be the best solution (it would be better to infer the fields from the subquery parameters), but it helped in our case, so I'm sharing it here. > [subquery] facets support > -- > > Key: SOLR-10825 > URL: https://issues.apache.org/jira/browse/SOLR-10825 > Project: Solr > Issue Type: Improvement >Reporter: Mikhail Khludnev >Priority: Major > > Let's also have facets in subquery response. I'm not sure about a particular > format (where to nest a tag particularly), but I suppose it's worth > encouraging {{json.facet}} usage. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
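The field-limiting workaround described in the comment above can be sketched as follows. This is a hypothetical helper, not the actual transformer code; the `row.` parameter prefix and the use of plain Maps in place of Solr's document and params types are assumptions for illustration:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class SubqueryParamsSketch {
    /**
     * Converts a parent document into subquery parameters, optionally limited
     * to an explicit field list (the workaround described above). A null field
     * set falls back to passing every field, i.e. the current behavior.
     */
    public static Map<String, String> toParams(Map<String, Object> doc,
                                               Set<String> fields) {
        Map<String, String> params = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            if (fields == null || fields.contains(e.getKey())) {
                // "row." prefix is illustrative of how subqueries reference
                // parent-document fields.
                params.put("row." + e.getKey(), String.valueOf(e.getValue()));
            }
        }
        return params;
    }
}
```

Limiting the field set avoids serializing large stored fields into subquery parameters when the subquery only references one or two of them.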
[jira] [Resolved] (SOLR-14946) responseHeader is returned when using EmbeddedSolrServer even when omitHeader=true
[ https://issues.apache.org/jira/browse/SOLR-14946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Munendra S N resolved SOLR-14946. - Fix Version/s: 8.8 Resolution: Fixed > responseHeader is returned when using EmbeddedSolrServer even when > omitHeader=true > -- > > Key: SOLR-14946 > URL: https://issues.apache.org/jira/browse/SOLR-14946 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Munendra S N >Priority: Trivial > Fix For: 8.8 > > Time Spent: 10m > Remaining Estimate: 0h > > When omitHeader=true, the responseHeader shouldn't be returned, which is not the > case when using EmbeddedSolrServer -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14946) responseHeader is returned when using EmbeddedSolrServer even when omitHeader=true
[ https://issues.apache.org/jira/browse/SOLR-14946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223649#comment-17223649 ] ASF subversion and git services commented on SOLR-14946: Commit f93b2821c406a11c6d4400ed85ced6265c764740 in lucene-solr's branch refs/heads/branch_8x from Munendra S N [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f93b282 ] SOLR-14946: fix responseHeader returned in resp with omitHeader=true (#2029) * This occurs when BinaryResponseWriter#getParsedResponse is called as it doesn't check for omitHeader. > responseHeader is returned when using EmbeddedSolrServer even when > omitHeader=true > -- > > Key: SOLR-14946 > URL: https://issues.apache.org/jira/browse/SOLR-14946 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Munendra S N >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > When omitHeader=true, responseHeader shouldn't be returned which is not the > case when using EmbededSolrServer -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14946) responseHeader is returned when using EmbeddedSolrServer even when omitHeader=true
[ https://issues.apache.org/jira/browse/SOLR-14946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223644#comment-17223644 ] ASF subversion and git services commented on SOLR-14946: Commit f3fdd9b90b8e4af4c62708c1cd28a00347c1416e in lucene-solr's branch refs/heads/master from Munendra S N [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=f3fdd9b ] SOLR-14946: fix responseHeader returned in resp with omitHeader=true (#2029) * This occurs when BinaryResponseWriter#getParsedResponse is called as it doesn't check for omitHeader. > responseHeader is returned when using EmbeddedSolrServer even when > omitHeader=true > -- > > Key: SOLR-14946 > URL: https://issues.apache.org/jira/browse/SOLR-14946 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Munendra S N >Priority: Trivial > Time Spent: 10m > Remaining Estimate: 0h > > When omitHeader=true, responseHeader shouldn't be returned which is not the > case when using EmbededSolrServer -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
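The fix referenced in the commits above boils down to honoring omitHeader when the parsed response is built. A minimal stand-in, using a plain Map instead of Solr's NamedList and a hypothetical method name:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class OmitHeaderSketch {
    /**
     * Removes the responseHeader entry when omitHeader=true, mirroring the
     * intent of the guard the fix adds around building the parsed response
     * (BinaryResponseWriter#getParsedResponse previously skipped this check).
     */
    public static Map<String, Object> applyOmitHeader(Map<String, Object> response,
                                                      boolean omitHeader) {
        if (omitHeader) {
            response.remove("responseHeader");
        }
        return response;
    }
}
```

With this guard in the embedded path, EmbeddedSolrServer responses match what an HTTP client sees when omitHeader=true.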
[jira] [Commented] (SOLR-14969) Prevent creating multiple cores with the same name which leads to instabilities (race condition)
[ https://issues.apache.org/jira/browse/SOLR-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223638#comment-17223638 ] Erick Erickson commented on SOLR-14969: --- Ah, thanks for the explanation. You're right, I'll fix that up. Good catch! > Prevent creating multiple cores with the same name which leads to > instabilities (race condition) > > > Key: SOLR-14969 > URL: https://issues.apache.org/jira/browse/SOLR-14969 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: multicore >Affects Versions: 8.6, 8.6.3 >Reporter: Andreas Hubold >Assignee: Erick Erickson >Priority: Major > Attachments: CmCoreAdminHandler.java > > Time Spent: 1h 20m > Remaining Estimate: 0h > > CoreContainer#create does not correctly handle concurrent requests to create > the same core. There's a race condition (see also existing TODO comment in > the code), and CoreContainer#createFromDescriptor may be called subsequently > for the same core name. > The _second call_ then fails to create an IndexWriter, and exception handling > causes an inconsistent CoreContainer state. > {noformat} > 2020-10-27 00:29:25.350 ERROR (qtp2029754983-24) [ ] > o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Error > CREATEing SolrCore 'blueprint_acgqqafsogyc_comments': Unable to create core > [blueprint_acgqqafsogyc_comments] Caused by: Lock held by this virtual > machine: /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1312) > at > org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:95) > at > org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:367) > ... 
> Caused by: org.apache.solr.common.SolrException: Unable to create core > [blueprint_acgqqafsogyc_comments] > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1408) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1273) > ... 47 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.(SolrCore.java:1071) > at org.apache.solr.core.SolrCore.(SolrCore.java:906) > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1387) > ... 48 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2184) > at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2308) > at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1130) > at org.apache.solr.core.SolrCore.(SolrCore.java:1012) > ... 50 more > Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held by > this virtual machine: > /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock > at > org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:139) > at > org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41) > at > org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45) > at > org.apache.lucene.store.FilterDirectory.obtainLock(FilterDirectory.java:105) > at org.apache.lucene.index.IndexWriter.(IndexWriter.java:785) > at > org.apache.solr.update.SolrIndexWriter.(SolrIndexWriter.java:126) > at > org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:100) > at > org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:261) > at > org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:135) > at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2145) > {noformat} > CoreContainer#createFromDescriptor removes the 
CoreDescriptor when handling > this exception. The SolrCore created for the first successful call is still > registered in SolrCores.cores, but now there's no corresponding > CoreDescriptor for that name anymore. > This inconsistency leads to subsequent NullPointerExceptions, for example > when using CoreAdmin STATUS with the core name: > CoreAdminOperation#getCoreStatus first gets the non-null SolrCore > (cores.getCore(cname)) but core.getInstancePath() throws an NPE, because the > CoreDescriptor is not registered anymore: > {noformat} > 2020-10-27 00:29:25.353 INFO (qtp2029754983-19) [ ] o.a.s.s.HttpSolrCall > [admin] webapp=null path=/admin/cores >
[jira] [Assigned] (SOLR-14926) Modernize and clean up document clustering contrib
[ https://issues.apache.org/jira/browse/SOLR-14926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss reassigned SOLR-14926: -- Assignee: Dawid Weiss > Modernize and clean up document clustering contrib > -- > > Key: SOLR-14926 > URL: https://issues.apache.org/jira/browse/SOLR-14926 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Major > Fix For: 8.8 > > > The clustering contrib was written a long time ago and it shows its age. We > have two separate interfaces (for clustering search results and for > clustering documents). The second one never received any attention and no > implementation exists for it. > I plan to do the following: > - *remove* the document clustering interface entirely, leave only the > post-search results clustering extension, > - upgrade the implementation to Carrot2 4.x (this gets rid of those > long-standing odd dependencies), > - clean up the code where appropriate. > My plan is to apply this to master, initially, but also backport to 8x if > there are no objections. I don't think it'll hurt anybody. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Comment Edited] (SOLR-14969) Prevent creating multiple cores with the same name which leads to instabilities (race condition)
[ https://issues.apache.org/jira/browse/SOLR-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223623#comment-17223623 ] Andreas Hubold edited comment on SOLR-14969 at 10/30/20, 12:51 PM: --- [~erickerickson] It seems you've misunderstood my point. The core name must of course be removed in the finally block, if but only if it was added by the very same invocation. However, it *must not* be removed by an invocation, that didn't add the core name itself. Currently, with your fix, the following scenario is possible: * first call is made, and adds coreName to inFlightCreations * second simultaneous call detects that the core is already being created, and correctly throws an exception, but it also removes the core from inFlightCreations. * third simultaneous call is made, does not see coreName in inFlightCreations, and proceeds even though the first call is still not finished I hope this makes it more clear. In my attached workaround, this situation is handled by setting the variable "createCore" to null if a previous create call is still in progress, and an additional "if (createCore!=null)" condition in the finally block. This is of course not directly applicable to your fix, but the pattern could be similar. was (Author: ahubold): [~erickerickson] It seems you've misunderstood my point. The core name must of course be removed in the finally block, if but only if it was added by the very same invocation. However, it *must not* be removed by an invocation, that didn't add the core name itself. Currently, with your fix, the following scenario is possible: * first call is made, and adds coreName to inFlightCreations * second simultaneous call detects that the core is already being created, and correctly throws an exception, but it also removes the core from inFlightCreations. 
* third simultaneous call is made, does not see coreName in inFlightCreations, and proceeds even though the first call is still not finished I can reproduce such problems with concurrent create requests. I hope this makes it more clear. In my attached workaround, this situation is handled by setting the variable "createCore" to null if a previous create call is still in progress, and an additional "if (createCore!=null)" condition in the finally block. This is of course not directly applicable to your fix, but the pattern could be similar. > Prevent creating multiple cores with the same name which leads to > instabilities (race condition) > > > Key: SOLR-14969 > URL: https://issues.apache.org/jira/browse/SOLR-14969 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: multicore >Affects Versions: 8.6, 8.6.3 >Reporter: Andreas Hubold >Assignee: Erick Erickson >Priority: Major > Attachments: CmCoreAdminHandler.java > > Time Spent: 1h 20m > Remaining Estimate: 0h > > CoreContainer#create does not correctly handle concurrent requests to create > the same core. There's a race condition (see also existing TODO comment in > the code), and CoreContainer#createFromDescriptor may be called subsequently > for the same core name. > The _second call_ then fails to create an IndexWriter, and exception handling > causes an inconsistent CoreContainer state. 
> {noformat} > 2020-10-27 00:29:25.350 ERROR (qtp2029754983-24) [ ] > o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Error > CREATEing SolrCore 'blueprint_acgqqafsogyc_comments': Unable to create core > [blueprint_acgqqafsogyc_comments] Caused by: Lock held by this virtual > machine: /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1312) > at > org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:95) > at > org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:367) > ... > Caused by: org.apache.solr.common.SolrException: Unable to create core > [blueprint_acgqqafsogyc_comments] > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1408) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1273) > ... 47 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.(SolrCore.java:1071) > at org.apache.solr.core.SolrCore.(SolrCore.java:906) > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1387) > ... 48 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher >
[jira] [Updated] (SOLR-14926) Modernize and clean up document clustering contrib
[ https://issues.apache.org/jira/browse/SOLR-14926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss updated SOLR-14926: --- Fix Version/s: 8.8 > Modernize and clean up document clustering contrib > -- > > Key: SOLR-14926 > URL: https://issues.apache.org/jira/browse/SOLR-14926 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Dawid Weiss >Priority: Major > Fix For: 8.8 > > > The clustering contrib was written a long time ago and it shows its age. We > have two separate interfaces (for clustering search results and for > clustering documents). The second one never received any attention and no > implementation exists for it. > I plan to do the following: > - *remove* the document clustering interface entirely, leave only the > post-search results clustering extension, > - upgrade the implementation to Carrot2 4.x (this gets rid of those > long-standing odd dependencies), > - clean up the code where appropriate. > My plan is to apply this to master, initially, but also backport to 8x if > there are no objections. I don't think it'll hurt anybody. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14926) Modernize and clean up document clustering contrib
[ https://issues.apache.org/jira/browse/SOLR-14926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223631#comment-17223631 ] Dawid Weiss commented on SOLR-14926: This is a rather large patch but I think the state it leaves Solr in is much better than before. The clustering extension now works in distributed mode, has fewer dependencies and just generally cleans up a lot of old, unused cruft that has accumulated over the years. If somebody wishes to take a look at the PR, please go ahead. I plan to create a backport PR soon and merge it next week if there are no objections. > Modernize and clean up document clustering contrib > -- > > Key: SOLR-14926 > URL: https://issues.apache.org/jira/browse/SOLR-14926 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Dawid Weiss >Assignee: Dawid Weiss >Priority: Major > Fix For: 8.8 > > > The clustering contrib was written a long time ago and it shows its age. We > have two separate interfaces (for clustering search results and for > clustering documents). The second one never received any attention and no > implementation exists for it. > I plan to do the following: > - *remove* the document clustering interface entirely, leave only the > post-search results clustering extension, > - upgrade the implementation to Carrot2 4.x (this gets rid of those > long-standing odd dependencies), > - clean up the code where appropriate. > My plan is to apply this to master, initially, but also backport to 8x if > there are no objections. I don't think it'll hurt anybody. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14969) Prevent creating multiple cores with the same name which leads to instabilities (race condition)
[ https://issues.apache.org/jira/browse/SOLR-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223623#comment-17223623 ] Andreas Hubold commented on SOLR-14969: --- [~erickerickson] It seems you've misunderstood my point. The core name must of course be removed in the finally block, but only if it was added by the very same invocation. However, it *must not* be removed by an invocation that didn't add the core name itself. Currently, with your fix, the following scenario is possible: * first call is made, and adds coreName to inFlightCreations * second simultaneous call detects that the core is already being created, and correctly throws an exception, but it also removes the core from inFlightCreations. * third call is made, does not see coreName in inFlightCreations, and proceeds even though the first call is still not finished I can reproduce such problems with concurrent create requests. I hope this makes it clearer. In my attached workaround, this situation is handled by setting the variable "createCore" to null if a previous create call is still in progress, and an additional "if (createCore!=null)" condition in the finally block. This is of course not directly applicable to your fix, but the pattern could be similar. > Prevent creating multiple cores with the same name which leads to > instabilities (race condition) > > > Key: SOLR-14969 > URL: https://issues.apache.org/jira/browse/SOLR-14969 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: multicore >Affects Versions: 8.6, 8.6.3 >Reporter: Andreas Hubold >Assignee: Erick Erickson >Priority: Major > Attachments: CmCoreAdminHandler.java > > Time Spent: 1h 20m > Remaining Estimate: 0h > > CoreContainer#create does not correctly handle concurrent requests to create > the same core.
There's a race condition (see also existing TODO comment in > the code), and CoreContainer#createFromDescriptor may be called subsequently > for the same core name. > The _second call_ then fails to create an IndexWriter, and exception handling > causes an inconsistent CoreContainer state. > {noformat} > 2020-10-27 00:29:25.350 ERROR (qtp2029754983-24) [ ] > o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Error > CREATEing SolrCore 'blueprint_acgqqafsogyc_comments': Unable to create core > [blueprint_acgqqafsogyc_comments] Caused by: Lock held by this virtual > machine: /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1312) > at > org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:95) > at > org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:367) > ... > Caused by: org.apache.solr.common.SolrException: Unable to create core > [blueprint_acgqqafsogyc_comments] > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1408) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1273) > ... 47 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.(SolrCore.java:1071) > at org.apache.solr.core.SolrCore.(SolrCore.java:906) > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1387) > ... 48 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2184) > at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2308) > at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1130) > at org.apache.solr.core.SolrCore.(SolrCore.java:1012) > ... 
50 more > Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held by > this virtual machine: > /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock > at > org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:139) > at > org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41) > at > org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45) > at > org.apache.lucene.store.FilterDirectory.obtainLock(FilterDirectory.java:105) > at org.apache.lucene.index.IndexWriter.(IndexWriter.java:785) > at > org.apache.solr.update.SolrIndexWriter.(SolrIndexWriter.java:126) > at > org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:100) > at >
[jira] [Comment Edited] (SOLR-14969) Prevent creating multiple cores with the same name which leads to instabilities (race condition)
[ https://issues.apache.org/jira/browse/SOLR-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223623#comment-17223623 ] Andreas Hubold edited comment on SOLR-14969 at 10/30/20, 12:42 PM: --- [~erickerickson] It seems you've misunderstood my point. The core name must of course be removed in the finally block, if but only if it was added by the very same invocation. However, it *must not* be removed by an invocation, that didn't add the core name itself. Currently, with your fix, the following scenario is possible: * first call is made, and adds coreName to inFlightCreations * second simultaneous call detects that the core is already being created, and correctly throws an exception, but it also removes the core from inFlightCreations. * third simultaneous call is made, does not see coreName in inFlightCreations, and proceeds even though the first call is still not finished I can reproduce such problems with concurrent create requests. I hope this makes it more clear. In my attached workaround, this situation is handled by setting the variable "createCore" to null if a previous create call is still in progress, and an additional "if (createCore!=null)" condition in the finally block. This is of course not directly applicable to your fix, but the pattern could be similar. was (Author: ahubold): [~erickerickson] It seems you've misunderstood my point. The core name must of course be removed in the finally block, if but only if it was added by the very same invocation. However, it *must not* be removed by an invocation, that didn't add the core name itself. Currently, with your fix, the following scenario is possible: * first call is made, and adds coreName to inFlightCreations * second simultaneous call detects that the core is already being created, and correctly throws an exception, but it also removes the core from inFlightCreations. 
* third call is made, does not see coreName in inFlightCreations, and proceeds even though the first call is still not finished I can reproduce such problems with concurrent create requests. I hope this makes it more clear. In my attached workaround, this situation is handled by setting the variable "createCore" to null if a previous create call is still in progress, and an additional "if (createCore!=null)" condition in the finally block. This is of course not directly applicable to your fix, but the pattern could be similar. > Prevent creating multiple cores with the same name which leads to > instabilities (race condition) > > > Key: SOLR-14969 > URL: https://issues.apache.org/jira/browse/SOLR-14969 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: multicore >Affects Versions: 8.6, 8.6.3 >Reporter: Andreas Hubold >Assignee: Erick Erickson >Priority: Major > Attachments: CmCoreAdminHandler.java > > Time Spent: 1h 20m > Remaining Estimate: 0h > > CoreContainer#create does not correctly handle concurrent requests to create > the same core. There's a race condition (see also existing TODO comment in > the code), and CoreContainer#createFromDescriptor may be called subsequently > for the same core name. > The _second call_ then fails to create an IndexWriter, and exception handling > causes an inconsistent CoreContainer state. 
> {noformat} > 2020-10-27 00:29:25.350 ERROR (qtp2029754983-24) [ ] > o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Error > CREATEing SolrCore 'blueprint_acgqqafsogyc_comments': Unable to create core > [blueprint_acgqqafsogyc_comments] Caused by: Lock held by this virtual > machine: /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1312) > at > org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:95) > at > org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:367) > ... > Caused by: org.apache.solr.common.SolrException: Unable to create core > [blueprint_acgqqafsogyc_comments] > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1408) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1273) > ... 47 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.(SolrCore.java:1071) > at org.apache.solr.core.SolrCore.(SolrCore.java:906) > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1387) > ... 48 more > Caused by:
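The three-call scenario described in the comment above can be reduced to a small model. This is a hypothetical sketch (names like beginCreateBuggy are invented for illustration, not the actual CoreContainer code): a rejection path that removes the in-flight name it never added lets a third caller proceed while the first creation is still running.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Minimal model of the flawed cleanup: the rejected second call removes a
// name it did not add, so the guard is gone for any third caller.
class InFlightDemo {
    static final Set<String> inFlight = ConcurrentHashMap.newKeySet();

    // Returns true if the caller may proceed with creation; false models
    // the thrown SolrException for a concurrent duplicate request.
    static boolean beginCreateBuggy(String coreName) {
        if (inFlight.contains(coreName)) {
            inFlight.remove(coreName);  // BUG: this call never added the name
            return false;
        }
        inFlight.add(coreName);
        return true;
    }
}
```

Running three "calls" in sequence, without the first ever finishing, shows call 3 slipping past the guard exactly as described.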
[jira] [Commented] (SOLR-14926) Modernize and clean up document clustering contrib
[ https://issues.apache.org/jira/browse/SOLR-14926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223619#comment-17223619 ] Dawid Weiss commented on SOLR-14926: Distributed mode rewritten from scratch. > Modernize and clean up document clustering contrib > -- > > Key: SOLR-14926 > URL: https://issues.apache.org/jira/browse/SOLR-14926 > Project: Solr > Issue Type: Improvement > Security Level: Public(Default Security Level. Issues are Public) >Reporter: Dawid Weiss >Priority: Major > > The clustering contrib was written a long time ago and it shows its age. We > have two separate interfaces (for clustering search results and for > clustering documents). The second one never received any attention and no > implementation exists for it. > I plan to do the following: > - *remove* the document clustering interface entirely, leave only the > post-search results clustering extension, > - upgrade the implementation to Carrot2 4.x (this gets rid of those > long-standing odd dependencies), > - clean up the code where appropriate. > My plan is to apply this to master, initially, but also backport to 8x if > there are no objections. I don't think it'll hurt anybody. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Resolved] (SOLR-13506) Upgrade carrot2-guava-*.jar
[ https://issues.apache.org/jira/browse/SOLR-13506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved SOLR-13506. Resolution: Won't Fix No longer relevant after SOLR-14926. > Upgrade carrot2-guava-*.jar > > > Key: SOLR-13506 > URL: https://issues.apache.org/jira/browse/SOLR-13506 > Project: Solr > Issue Type: Bug > Components: contrib - Clustering >Affects Versions: 7.7.1, 8.0, 8.1 >Reporter: DW >Assignee: Dawid Weiss >Priority: Major > > The Solr package contains /contrib/clustering/lib/carrot2-guava-18.0.jar. > [cpe:/a:google:guava:18.0|https://web.nvd.nist.gov/view/vuln/search-results?adv_search=true=on_version=cpe%3A%2Fa%3Agoogle%3Aguava%3A18.0] > has known security vulnerabilities. > Can you please upgrade the library or remove it if not needed. > Thanks.
[jira] [Commented] (SOLR-14969) Prevent creating multiple cores with the same name which leads to instabilities (race condition)
[ https://issues.apache.org/jira/browse/SOLR-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223612#comment-17223612 ] Erick Erickson commented on SOLR-14969: --- [~ahubold] I disagree: the core name must be removed from inFlightCreations without fail. Its only purpose is to keep any other create request from starting to create the same core _for the duration of that create call only_. If the coreName was left in that structure, then the list would grow forever. And you couldn't create a core, delete it, then recreate it. Nor could you get partway through core creation, have an error, fix the error and then create the core. A second simultaneous call won't add the coreName to inFlightCreations; it'll throw an exception at the top. So the originator is the only one who can, and must, remove it. If a core is created successfully and _another_ call is made to try to create the same core later, it fails this test a little later. {code:java} if (getAllCoreNames().contains(coreName)) { log.warn("Creating a core with existing name is not allowed"); // TODO: Shouldn't this be a BAD_REQUEST? throw new SolrException(ErrorCode.SERVER_ERROR, "Core with name '" + coreName + "' already exists."); } {code} If you still think the name should not be removed, we need to figure out why we disagree. > Prevent creating multiple cores with the same name which leads to > instabilities (race condition) > > > Key: SOLR-14969 > URL: https://issues.apache.org/jira/browse/SOLR-14969 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: multicore >Affects Versions: 8.6, 8.6.3 >Reporter: Andreas Hubold >Assignee: Erick Erickson >Priority: Major > Attachments: CmCoreAdminHandler.java > > Time Spent: 1h 20m > Remaining Estimate: 0h > > CoreContainer#create does not correctly handle concurrent requests to create > the same core. 
There's a race condition (see also existing TODO comment in > the code), and CoreContainer#createFromDescriptor may be called subsequently > for the same core name. > The _second call_ then fails to create an IndexWriter, and exception handling > causes an inconsistent CoreContainer state. > {noformat} > 2020-10-27 00:29:25.350 ERROR (qtp2029754983-24) [ ] > o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Error > CREATEing SolrCore 'blueprint_acgqqafsogyc_comments': Unable to create core > [blueprint_acgqqafsogyc_comments] Caused by: Lock held by this virtual > machine: /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1312) > at > org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:95) > at > org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:367) > ... > Caused by: org.apache.solr.common.SolrException: Unable to create core > [blueprint_acgqqafsogyc_comments] > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1408) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1273) > ... 47 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.(SolrCore.java:1071) > at org.apache.solr.core.SolrCore.(SolrCore.java:906) > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1387) > ... 48 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2184) > at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2308) > at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1130) > at org.apache.solr.core.SolrCore.(SolrCore.java:1012) > ... 
50 more > Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held by > this virtual machine: > /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock > at > org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:139) > at > org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41) > at > org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45) > at > org.apache.lucene.store.FilterDirectory.obtainLock(FilterDirectory.java:105) > at org.apache.lucene.index.IndexWriter.(IndexWriter.java:785) > at > org.apache.solr.update.SolrIndexWriter.(SolrIndexWriter.java:126) > at >
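The pattern the two comments converge on can be sketched as follows. This is a hypothetical model (CreateGuard and its methods are invented names, not Solr code), assuming a concurrent set: Set.add() atomically tells a caller whether *it* claimed the name, a rejected caller cleans up nothing, and only the successful claimant removes the name, unconditionally, in its finally block.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the "only the originator removes" guard: the claim and the
// duplicate check are one atomic add(), and cleanup is tied to a
// successful claim rather than to every invocation.
class CreateGuard {
    static final Set<String> inFlight = ConcurrentHashMap.newKeySet();

    // Runs `work` as the core-creation body; "rejected" models the
    // SolrException thrown for a concurrent duplicate request.
    static String create(String coreName, Runnable work) {
        if (!inFlight.add(coreName)) {
            return "rejected";            // this call claimed nothing, so it removes nothing
        }
        try {
            work.run();                   // the real creation work would happen here
            return "created";
        } finally {
            inFlight.remove(coreName);    // originator always releases its own claim
        }
    }
}
```

A duplicate call arriving mid-creation is rejected without disturbing the guard, and once the originator finishes the name can be reused, which also covers the delete-and-recreate and fix-error-and-retry cases Erick raises.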
[jira] [Updated] (SOLR-14973) Solr 8.6 is shipping libraries that are incompatible with each other
[ https://issues.apache.org/jira/browse/SOLR-14973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Samir Huremovic updated SOLR-14973: --- Description: Hi, since Solr 8.6 the version of {{tika-parsers}} was updated to {{1.24}}. This version of {{tika-parsers}} needs the {{poi}} library in version {{4.1.2}} (see https://issues.apache.org/jira/browse/TIKA-3047) Solr has version {{4.1.1}} of poi included. This creates (at least) a problem for parsing {{.xls}} files. The following exception gets thrown by trying to post an {{.xls}} file in the techproducts example: {{java.lang.NoSuchMethodError: org.apache.poi.hssf.record.common.UnicodeString.getExtendedRst()Lorg/apache/poi/hssf/record/common/ExtRst;}} was: Hi, since Solr 8.6 the version of `tika-parsers` was updated to `1.24`. This version of `tika-parsers` needs the `poi` library in version `4.1.2` (see https://issues.apache.org/jira/browse/TIKA-3047) Solr has version `4.1.1` of poi included. This creates (at least) a problem for parsing `.xls` files. The following exception gets thrown by trying to post an `.xls` file in the techproducts example: ``` java.lang.NoSuchMethodError: org.apache.poi.hssf.record.common.UnicodeString.getExtendedRst()Lorg/apache/poi/hssf/record/common/ExtRst; ``` > Solr 8.6 is shipping libraries that are incompatible with each other > > > Key: SOLR-14973 > URL: https://issues.apache.org/jira/browse/SOLR-14973 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: contrib - Solr Cell (Tika extraction) >Affects Versions: 8.6 >Reporter: Samir Huremovic >Priority: Major > Labels: tika-parsers > > Hi, > since Solr 8.6 the version of {{tika-parsers}} was updated to {{1.24}}. This > version of {{tika-parsers}} needs the {{poi}} library in version {{4.1.2}} > (see https://issues.apache.org/jira/browse/TIKA-3047) > Solr has version {{4.1.1}} of poi included. > This creates (at least) a problem for parsing {{.xls}} files. 
The following > exception gets thrown when trying to post an {{.xls}} file in the techproducts > example: > {{java.lang.NoSuchMethodError: > org.apache.poi.hssf.record.common.UnicodeString.getExtendedRst()Lorg/apache/poi/hssf/record/common/ExtRst;}}
[jira] [Created] (SOLR-14973) Solr 8.6 is shipping libraries that are incompatible with each other
Samir Huremovic created SOLR-14973: -- Summary: Solr 8.6 is shipping libraries that are incompatible with each other Key: SOLR-14973 URL: https://issues.apache.org/jira/browse/SOLR-14973 Project: Solr Issue Type: Bug Security Level: Public (Default Security Level. Issues are Public) Components: contrib - Solr Cell (Tika extraction) Affects Versions: 8.6 Reporter: Samir Huremovic Hi, since Solr 8.6 the version of `tika-parsers` was updated to `1.24`. This version of `tika-parsers` needs the `poi` library in version `4.1.2` (see https://issues.apache.org/jira/browse/TIKA-3047). Solr has version `4.1.1` of poi included. This creates (at least) a problem for parsing `.xls` files. The following exception gets thrown when trying to post an `.xls` file in the techproducts example: ``` java.lang.NoSuchMethodError: org.apache.poi.hssf.record.common.UnicodeString.getExtendedRst()Lorg/apache/poi/hssf/record/common/ExtRst; ```
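For builds that embed tika-parsers 1.24 directly (rather than using the jars shipped under Solr's contrib directory), one possible workaround is to pin poi to the version Tika 1.24 expects via Maven's dependencyManagement. This is a sketch under the assumption that the standard org.apache.poi coordinates apply to your dependency tree; verify with `mvn dependency:tree` before relying on it. For a stock Solr install the equivalent would be swapping the poi jars in contrib/extraction/lib.

```xml
<dependencyManagement>
  <dependencies>
    <!-- TIKA-3047: tika-parsers 1.24 requires poi 4.1.2; Solr 8.6 bundles 4.1.1 -->
    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi</artifactId>
      <version>4.1.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi-ooxml</artifactId>
      <version>4.1.2</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```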
[jira] [Comment Edited] (SOLR-14837) prometheus-exporter: different metrics ports publishes mixed metrics
[ https://issues.apache.org/jira/browse/SOLR-14837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223572#comment-17223572 ] Fadi Mohsen edited comment on SOLR-14837 at 10/30/20, 11:12 AM: Not sure about the process, I cloned from master, setup eclipse and got a couple of import warnings (sneaked in in patch.log), just: {noformat} git apply patch.log{noformat} was (Author: fadi.moh...@gmail.com): Not sure about the process, I cloned from master, setup eclipse and got a couple of import warnings (sneaked in in patch.log), Attaching patch.log, output of git diff. {noformat} git diff > patch.log{noformat} > prometheus-exporter: different metrics ports publishes mixed metrics > > > Key: SOLR-14837 > URL: https://issues.apache.org/jira/browse/SOLR-14837 > Project: Solr > Issue Type: Improvement > Components: contrib - prometheus-exporter >Affects Versions: 8.6.2 >Reporter: Fadi Mohsen >Priority: Minor > Attachments: patch.log > > > when calling SolrExporter.main "pro-grammatically"/"same JVM" with two > different solr masters asking to publish the metrics on two different ports, > the metrics are being mixed on both metric endpoints from the two solr > masters. > This was tracked down to a static variable called *defaultRegistry*: > https://github.com/apache/lucene-solr/blob/master/solr/contrib/prometheus-exporter/src/java/org/apache/solr/prometheus/exporter/SolrExporter.java#L86 > removing the static keyword fixes the issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Updated] (SOLR-14837) prometheus-exporter: different metrics ports publishes mixed metrics
[ https://issues.apache.org/jira/browse/SOLR-14837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Fadi Mohsen updated SOLR-14837: --- Attachment: patch.log > prometheus-exporter: different metrics ports publishes mixed metrics > > > Key: SOLR-14837 > URL: https://issues.apache.org/jira/browse/SOLR-14837 > Project: Solr > Issue Type: Improvement > Components: contrib - prometheus-exporter >Affects Versions: 8.6.2 >Reporter: Fadi Mohsen >Priority: Minor > Attachments: patch.log > > > when calling SolrExporter.main "pro-grammatically"/"same JVM" with two > different solr masters asking to publish the metrics on two different ports, > the metrics are being mixed on both metric endpoints from the two solr > masters. > This was tracked down to a static variable called *defaultRegistry*: > https://github.com/apache/lucene-solr/blob/master/solr/contrib/prometheus-exporter/src/java/org/apache/solr/prometheus/exporter/SolrExporter.java#L86 > removing the static keyword fixes the issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[jira] [Commented] (SOLR-14837) prometheus-exporter: different metrics ports publishes mixed metrics
[ https://issues.apache.org/jira/browse/SOLR-14837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223572#comment-17223572 ] Fadi Mohsen commented on SOLR-14837: Not sure about the process, I cloned from master, setup eclipse and got a couple of import warnings (sneaked in in patch.log), Attaching patch.log, output of git diff. {noformat} git diff > patch.log{noformat} > prometheus-exporter: different metrics ports publishes mixed metrics > > > Key: SOLR-14837 > URL: https://issues.apache.org/jira/browse/SOLR-14837 > Project: Solr > Issue Type: Improvement > Components: contrib - prometheus-exporter >Affects Versions: 8.6.2 >Reporter: Fadi Mohsen >Priority: Minor > > when calling SolrExporter.main "pro-grammatically"/"same JVM" with two > different solr masters asking to publish the metrics on two different ports, > the metrics are being mixed on both metric endpoints from the two solr > masters. > This was tracked down to a static variable called *defaultRegistry*: > https://github.com/apache/lucene-solr/blob/master/solr/contrib/prometheus-exporter/src/java/org/apache/solr/prometheus/exporter/SolrExporter.java#L86 > removing the static keyword fixes the issue. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
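The effect of the static *defaultRegistry* field described in this report can be illustrated with a stripped-down model (hypothetical class, not the actual SolrExporter or Prometheus client API): two exporter instances sharing a static registry each end up serving the other's metrics, while per-instance registries keep them separate.

```java
import java.util.ArrayList;
import java.util.List;

// Stripped-down model of the bug: a registry held in a static field is
// shared by every exporter instance in the JVM, so metrics registered for
// one Solr master leak onto the other's metrics endpoint.
class ExporterModel {
    static final List<String> staticRegistry = new ArrayList<>();  // buggy: JVM-wide shared state
    final List<String> instanceRegistry = new ArrayList<>();       // fix: one registry per exporter

    void register(String metric) {
        staticRegistry.add(metric);    // what the static field causes
        instanceRegistry.add(metric);  // what dropping `static` gives you
    }
}
```

Registering one metric per exporter leaves the static registry with both entries (the mixed endpoints from the report), while each instance registry holds only its own.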
[GitHub] [lucene-solr] defonion commented on pull request #2042: SOLR-14961: ZkMaintenanceUtils.clean doesn't remove zk-nodes with same path length
defonion commented on pull request #2042: URL: https://github.com/apache/lucene-solr/pull/2042#issuecomment-719463236 I added the changes recommended by @madrob
[jira] [Updated] (SOLR-14969) Prevent creating multiple cores with the same name which leads to instabilities (race condition)
[ https://issues.apache.org/jira/browse/SOLR-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Hubold updated SOLR-14969: -- Attachment: CmCoreAdminHandler.java > Prevent creating multiple cores with the same name which leads to > instabilities (race condition) > > > Key: SOLR-14969 > URL: https://issues.apache.org/jira/browse/SOLR-14969 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: multicore >Affects Versions: 8.6, 8.6.3 >Reporter: Andreas Hubold >Assignee: Erick Erickson >Priority: Major > Attachments: CmCoreAdminHandler.java > > Time Spent: 1h 20m > Remaining Estimate: 0h > > CoreContainer#create does not correctly handle concurrent requests to create > the same core. There's a race condition (see also existing TODO comment in > the code), and CoreContainer#createFromDescriptor may be called subsequently > for the same core name. > The _second call_ then fails to create an IndexWriter, and exception handling > causes an inconsistent CoreContainer state. > {noformat} > 2020-10-27 00:29:25.350 ERROR (qtp2029754983-24) [ ] > o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Error > CREATEing SolrCore 'blueprint_acgqqafsogyc_comments': Unable to create core > [blueprint_acgqqafsogyc_comments] Caused by: Lock held by this virtual > machine: /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1312) > at > org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:95) > at > org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:367) > ... > Caused by: org.apache.solr.common.SolrException: Unable to create core > [blueprint_acgqqafsogyc_comments] > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1408) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1273) > ... 
47 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.(SolrCore.java:1071) > at org.apache.solr.core.SolrCore.(SolrCore.java:906) > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1387) > ... 48 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2184) > at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2308) > at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1130) > at org.apache.solr.core.SolrCore.(SolrCore.java:1012) > ... 50 more > Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held by > this virtual machine: > /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock > at > org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:139) > at > org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41) > at > org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45) > at > org.apache.lucene.store.FilterDirectory.obtainLock(FilterDirectory.java:105) > at org.apache.lucene.index.IndexWriter.(IndexWriter.java:785) > at > org.apache.solr.update.SolrIndexWriter.(SolrIndexWriter.java:126) > at > org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:100) > at > org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:261) > at > org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:135) > at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2145) > {noformat} > CoreContainer#createFromDescriptor removes the CoreDescriptor when handling > this exception. The SolrCore created for the first successful call is still > registered in SolrCores.cores, but now there's no corresponding > CoreDescriptor for that name anymore. 
> This inconsistency leads to subsequent NullPointerExceptions, for example > when using CoreAdmin STATUS with the core name: > CoreAdminOperation#getCoreStatus first gets the non-null SolrCore > (cores.getCore(cname)) but core.getInstancePath() throws an NPE, because the > CoreDescriptor is not registered anymore: > {noformat} > 2020-10-27 00:29:25.353 INFO (qtp2029754983-19) [ ] o.a.s.s.HttpSolrCall > [admin] webapp=null path=/admin/cores > params={core=blueprint_acgqqafsogyc_comments=STATUS=false=javabin=2} > status=500 QTime=0 >
[jira] [Updated] (SOLR-14969) Prevent creating multiple cores with the same name which leads to instabilities (race condition)
[ https://issues.apache.org/jira/browse/SOLR-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Hubold updated SOLR-14969: -- Attachment: (was: CmCoreAdminHandler.java) > Prevent creating multiple cores with the same name which leads to > instabilities (race condition) > > > Key: SOLR-14969 > URL: https://issues.apache.org/jira/browse/SOLR-14969 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: multicore >Affects Versions: 8.6, 8.6.3 >Reporter: Andreas Hubold >Assignee: Erick Erickson >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > CoreContainer#create does not correctly handle concurrent requests to create > the same core. There's a race condition (see also existing TODO comment in > the code), and CoreContainer#createFromDescriptor may be called subsequently > for the same core name. > The _second call_ then fails to create an IndexWriter, and exception handling > causes an inconsistent CoreContainer state. > {noformat} > 2020-10-27 00:29:25.350 ERROR (qtp2029754983-24) [ ] > o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Error > CREATEing SolrCore 'blueprint_acgqqafsogyc_comments': Unable to create core > [blueprint_acgqqafsogyc_comments] Caused by: Lock held by this virtual > machine: /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1312) > at > org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:95) > at > org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:367) > ... > Caused by: org.apache.solr.common.SolrException: Unable to create core > [blueprint_acgqqafsogyc_comments] > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1408) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1273) > ... 
47 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.(SolrCore.java:1071) > at org.apache.solr.core.SolrCore.(SolrCore.java:906) > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1387) > ... 48 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2184) > at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2308) > at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1130) > at org.apache.solr.core.SolrCore.(SolrCore.java:1012) > ... 50 more > Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held by > this virtual machine: > /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock > at > org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:139) > at > org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41) > at > org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45) > at > org.apache.lucene.store.FilterDirectory.obtainLock(FilterDirectory.java:105) > at org.apache.lucene.index.IndexWriter.(IndexWriter.java:785) > at > org.apache.solr.update.SolrIndexWriter.(SolrIndexWriter.java:126) > at > org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:100) > at > org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:261) > at > org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:135) > at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2145) > {noformat} > CoreContainer#createFromDescriptor removes the CoreDescriptor when handling > this exception. The SolrCore created for the first successful call is still > registered in SolrCores.cores, but now there's no corresponding > CoreDescriptor for that name anymore. 
> This inconsistency leads to subsequent NullPointerExceptions, for example > when using CoreAdmin STATUS with the core name: > CoreAdminOperation#getCoreStatus first gets the non-null SolrCore > (cores.getCore(cname)) but core.getInstancePath() throws an NPE, because the > CoreDescriptor is not registered anymore: > {noformat} > 2020-10-27 00:29:25.353 INFO (qtp2029754983-19) [ ] o.a.s.s.HttpSolrCall > [admin] webapp=null path=/admin/cores > params={core=blueprint_acgqqafsogyc_comments=STATUS=false=javabin=2} > status=500 QTime=0 > 2020-10-27 00:29:25.353 ERROR
[GitHub] [lucene-solr] ahubold commented on a change in pull request #2049: SOLR-14969: Prevent creating multiple cores with the same name which leads to instabilities (race condition)
ahubold commented on a change in pull request #2049: URL: https://github.com/apache/lucene-solr/pull/2049#discussion_r514931450 ## File path: solr/core/src/java/org/apache/solr/core/CoreContainer.java ## @@ -1253,77 +1254,90 @@ public SolrCore create(String coreName, Map parameters) { * @return the newly created core */ public SolrCore create(String coreName, Path instancePath, Map parameters, boolean newCollection) { - -CoreDescriptor cd = new CoreDescriptor(coreName, instancePath, parameters, getContainerProperties(), getZkController()); - -// TODO: There's a race here, isn't there? -// Since the core descriptor is removed when a core is unloaded, it should never be anywhere when a core is created. -if (getAllCoreNames().contains(coreName)) { - log.warn("Creating a core with existing name is not allowed"); - // TODO: Shouldn't this be a BAD_REQUEST? - throw new SolrException(ErrorCode.SERVER_ERROR, "Core with name '" + coreName + "' already exists."); -} - -// Validate paths are relative to known locations to avoid path traversal -assertPathAllowed(cd.getInstanceDir()); -assertPathAllowed(Paths.get(cd.getDataDir())); - -boolean preExisitingZkEntry = false; try { - if (getZkController() != null) { -if (cd.getCloudDescriptor().getCoreNodeName() == null) { - throw new SolrException(ErrorCode.SERVER_ERROR, "coreNodeName missing " + parameters.toString()); + synchronized (inFlightCreations) { +if (inFlightCreations.contains(coreName)) { + String msg = "Already creating a core with name '" + coreName + "', call aborted '"; + log.warn(msg); + throw new SolrException(ErrorCode.SERVER_ERROR, msg); } -preExisitingZkEntry = getZkController().checkIfCoreNodeNameAlreadyExists(cd); +inFlightCreations.add(coreName); + } + CoreDescriptor cd = new CoreDescriptor(coreName, instancePath, parameters, getContainerProperties(), getZkController()); + + // TODO: There's a race here, isn't there? 
+ // Since the core descriptor is removed when a core is unloaded, it should never be anywhere when a core is created. + if (getAllCoreNames().contains(coreName)) { +log.warn("Creating a core with existing name is not allowed"); +// TODO: Shouldn't this be a BAD_REQUEST? +throw new SolrException(ErrorCode.SERVER_ERROR, "Core with name '" + coreName + "' already exists."); } - // Much of the logic in core handling pre-supposes that the core.properties file already exists, so create it - // first and clean it up if there's an error. - coresLocator.create(this, cd); + // Validate paths are relative to known locations to avoid path traversal + assertPathAllowed(cd.getInstanceDir()); + assertPathAllowed(Paths.get(cd.getDataDir())); - SolrCore core = null; + boolean preExisitingZkEntry = false; try { -solrCores.waitAddPendingCoreOps(cd.getName()); -core = createFromDescriptor(cd, true, newCollection); -coresLocator.persist(this, cd); // Write out the current core properties in case anything changed when the core was created - } finally { -solrCores.removeFromPendingOps(cd.getName()); - } +if (getZkController() != null) { + if (cd.getCloudDescriptor().getCoreNodeName() == null) { +throw new SolrException(ErrorCode.SERVER_ERROR, "coreNodeName missing " + parameters.toString()); + } + preExisitingZkEntry = getZkController().checkIfCoreNodeNameAlreadyExists(cd); +} - return core; -} catch (Exception ex) { - // First clean up any core descriptor, there should never be an existing core.properties file for any core that - // failed to be created on-the-fly. - coresLocator.delete(this, cd); - if (isZooKeeperAware() && !preExisitingZkEntry) { +// Much of the logic in core handling pre-supposes that the core.properties file already exists, so create it +// first and clean it up if there's an error. 
+coresLocator.create(this, cd); + +SolrCore core = null; try { - getZkController().unregister(coreName, cd); -} catch (InterruptedException e) { - Thread.currentThread().interrupt(); - SolrException.log(log, null, e); -} catch (KeeperException e) { - SolrException.log(log, null, e); -} catch (Exception e) { - SolrException.log(log, null, e); + solrCores.waitAddPendingCoreOps(cd.getName()); + core = createFromDescriptor(cd, true, newCollection); + coresLocator.persist(this, cd); // Write out the current core properties in case anything changed when the core was created +} finally { + solrCores.removeFromPendingOps(cd.getName()); } - } - Throwable tc = ex; -
[jira] [Commented] (SOLR-14969) Prevent creating multiple cores with the same name which leads to instabilities (race condition)
[ https://issues.apache.org/jira/browse/SOLR-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223501#comment-17223501 ] Andreas Hubold commented on SOLR-14969: --- For reference, I've attached my workaround (tested with Solr 8.6.3) in the form of a custom CoreAdminHandler subclass: [^CmCoreAdminHandler.java] It works similarly to Erick's fix, but it can't fix the problem for async CREATE requests. If anybody wants to use it, please change the package name and register it in your solr.xml with {{...}}. > Prevent creating multiple cores with the same name which leads to > instabilities (race condition) > > > Key: SOLR-14969 > URL: https://issues.apache.org/jira/browse/SOLR-14969 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: multicore > Affects Versions: 8.6, 8.6.3 > Reporter: Andreas Hubold > Assignee: Erick Erickson > Priority: Major > Attachments: CmCoreAdminHandler.java > > Time Spent: 1h 10m > Remaining Estimate: 0h > > CoreContainer#create does not correctly handle concurrent requests to create > the same core. There's a race condition (see also existing TODO comment in > the code), and CoreContainer#createFromDescriptor may be called subsequently > for the same core name. > The _second call_ then fails to create an IndexWriter, and exception handling > causes an inconsistent CoreContainer state.
> {noformat} > 2020-10-27 00:29:25.350 ERROR (qtp2029754983-24) [ ] > o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Error > CREATEing SolrCore 'blueprint_acgqqafsogyc_comments': Unable to create core > [blueprint_acgqqafsogyc_comments] Caused by: Lock held by this virtual > machine: /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1312) > at > org.apache.solr.handler.admin.CoreAdminOperation.lambda$static$0(CoreAdminOperation.java:95) > at > org.apache.solr.handler.admin.CoreAdminOperation.execute(CoreAdminOperation.java:367) > ... > Caused by: org.apache.solr.common.SolrException: Unable to create core > [blueprint_acgqqafsogyc_comments] > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1408) > at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1273) > ... 47 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.(SolrCore.java:1071) > at org.apache.solr.core.SolrCore.(SolrCore.java:906) > at > org.apache.solr.core.CoreContainer.createFromDescriptor(CoreContainer.java:1387) > ... 48 more > Caused by: org.apache.solr.common.SolrException: Error opening new searcher > at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2184) > at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:2308) > at org.apache.solr.core.SolrCore.initSearcher(SolrCore.java:1130) > at org.apache.solr.core.SolrCore.(SolrCore.java:1012) > ... 
50 more > Caused by: org.apache.lucene.store.LockObtainFailedException: Lock held by > this virtual machine: > /var/solr/data/blueprint_acgqqafsogyc_comments/data/index/write.lock > at > org.apache.lucene.store.NativeFSLockFactory.obtainFSLock(NativeFSLockFactory.java:139) > at > org.apache.lucene.store.FSLockFactory.obtainLock(FSLockFactory.java:41) > at > org.apache.lucene.store.BaseDirectory.obtainLock(BaseDirectory.java:45) > at > org.apache.lucene.store.FilterDirectory.obtainLock(FilterDirectory.java:105) > at org.apache.lucene.index.IndexWriter.(IndexWriter.java:785) > at > org.apache.solr.update.SolrIndexWriter.(SolrIndexWriter.java:126) > at > org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:100) > at > org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:261) > at > org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:135) > at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:2145) > {noformat} > CoreContainer#createFromDescriptor removes the CoreDescriptor when handling > this exception. The SolrCore created for the first successful call is still > registered in SolrCores.cores, but now there's no corresponding > CoreDescriptor for that name anymore. > This inconsistency leads to subsequent NullPointerExceptions, for example > when using CoreAdmin STATUS with the core name: > CoreAdminOperation#getCoreStatus first gets the non-null SolrCore >
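The race described in this issue is a classic check-then-act bug: two threads both pass the `getAllCoreNames().contains(coreName)` check before either registers the core, so the creation work runs twice for the same name. A minimal stand-alone sketch (hypothetical class and field names, not Solr code; a latch forces the losing interleaving deterministically):

```java
import java.util.Set;
import java.util.concurrent.*;

public class CheckThenActRace {
    // Hypothetical stand-ins for CoreContainer state: the registry of core names.
    static final Set<String> cores = ConcurrentHashMap.newKeySet();
    static final CountDownLatch bothChecked = new CountDownLatch(2);
    static int creations = 0;

    static void create(String name) throws InterruptedException {
        if (!cores.contains(name)) {      // check: name not registered yet
            bothChecked.countDown();
            bothChecked.await();          // hold both threads here until both have passed the check
            synchronized (CheckThenActRace.class) {
                creations++;              // stand-in for createFromDescriptor(...)
            }
            cores.add(name);              // act: register -- too late, both threads got here
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        for (int i = 0; i < 2; i++) {
            pool.submit(() -> {
                try { create("core1"); } catch (InterruptedException ignored) { }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        System.out.println("creations = " + creations);  // prints "creations = 2"
    }
}
```

The second "creation" is exactly what trips over the still-held `write.lock` in the stack trace above.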
[jira] [Updated] (SOLR-14969) Prevent creating multiple cores with the same name which leads to instabilities (race condition)
[ https://issues.apache.org/jira/browse/SOLR-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Hubold updated SOLR-14969: --- Attachment: CmCoreAdminHandler.java > Prevent creating multiple cores with the same name which leads to > instabilities (race condition) > > Key: SOLR-14969 > URL: https://issues.apache.org/jira/browse/SOLR-14969 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: multicore > Affects Versions: 8.6, 8.6.3 > Reporter: Andreas Hubold > Assignee: Erick Erickson > Priority: Major > Attachments: CmCoreAdminHandler.java > > This inconsistency leads to subsequent NullPointerExceptions, for example > when using CoreAdmin STATUS with the core name: > CoreAdminOperation#getCoreStatus first gets the non-null SolrCore > (cores.getCore(cname)) but core.getInstancePath() throws an NPE, because the > CoreDescriptor is not registered anymore: > {noformat} > 2020-10-27 00:29:25.353 INFO (qtp2029754983-19) [ ] o.a.s.s.HttpSolrCall > [admin] webapp=null path=/admin/cores > params={core=blueprint_acgqqafsogyc_comments=STATUS=false=javabin=2} > status=500 QTime=0 > {noformat}
[jira] [Commented] (SOLR-14969) Prevent creating multiple cores with the same name which leads to instabilities (race condition)
[ https://issues.apache.org/jira/browse/SOLR-14969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223492#comment-17223492 ] Andreas Hubold commented on SOLR-14969: --- Thanks for the fast response. The PR still contains a small error: the coreName is removed from inFlightCores even if it was added by a previous call. Funny thing is, I made exactly the same error in my first attempt at a workaround in a custom CoreAdminHandler. > Prevent creating multiple cores with the same name which leads to > instabilities (race condition) > > Key: SOLR-14969 > URL: https://issues.apache.org/jira/browse/SOLR-14969 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Components: multicore > Affects Versions: 8.6, 8.6.3 > Reporter: Andreas Hubold > Assignee: Erick Erickson > Priority: Major > Time Spent: 1h 10m > Remaining Estimate: 0h
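The error called out in this comment has a simple remedy: make the add atomic, and let only the call whose add succeeded perform the removal, so an aborted call never clobbers another call's in-flight entry. A hedged sketch with hypothetical names (`InFlightGuard` is an illustration of the pattern, not the Solr class):

```java
import java.util.HashSet;
import java.util.Set;

public class InFlightGuard {
    private final Set<String> inFlight = new HashSet<>();

    /** Atomically registers the name; returns false if another call already has it in flight. */
    public boolean tryMark(String name) {
        synchronized (inFlight) { return inFlight.add(name); }
    }

    public void unmark(String name) {
        synchronized (inFlight) { inFlight.remove(name); }
    }

    public String create(String name) {
        if (!tryMark(name)) {
            // Abort WITHOUT unmarking: the in-flight entry belongs to the other call.
            return "already creating " + name;
        }
        try {
            return "created " + name;  // stand-in for the real core-creation work
        } finally {
            unmark(name);              // safe: this call is the one that added the entry
        }
    }
}
```

The crucial property is that the abort path never touches the set, so a concurrent creation's entry survives until that creation finishes.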
[GitHub] [lucene-solr] ahubold commented on a change in pull request #2049: SOLR-14969: Prevent creating multiple cores with the same name which leads to instabilities (race condition)
ahubold commented on a change in pull request #2049: URL: https://github.com/apache/lucene-solr/pull/2049#discussion_r514931450

## File path: solr/core/src/java/org/apache/solr/core/CoreContainer.java ##

@@ -1253,77 +1254,90 @@ public SolrCore create(String coreName, Map<String, String> parameters) {
    * @return the newly created core
    */
   public SolrCore create(String coreName, Path instancePath, Map<String, String> parameters, boolean newCollection) {
-
-    CoreDescriptor cd = new CoreDescriptor(coreName, instancePath, parameters, getContainerProperties(), getZkController());
-
-    // TODO: There's a race here, isn't there?
-    // Since the core descriptor is removed when a core is unloaded, it should never be anywhere when a core is created.
-    if (getAllCoreNames().contains(coreName)) {
-      log.warn("Creating a core with existing name is not allowed");
-      // TODO: Shouldn't this be a BAD_REQUEST?
-      throw new SolrException(ErrorCode.SERVER_ERROR, "Core with name '" + coreName + "' already exists.");
-    }
-
-    // Validate paths are relative to known locations to avoid path traversal
-    assertPathAllowed(cd.getInstanceDir());
-    assertPathAllowed(Paths.get(cd.getDataDir()));
-
-    boolean preExisitingZkEntry = false;
     try {
-      if (getZkController() != null) {
-        if (cd.getCloudDescriptor().getCoreNodeName() == null) {
-          throw new SolrException(ErrorCode.SERVER_ERROR, "coreNodeName missing " + parameters.toString());
+      synchronized (inFlightCreations) {
+        if (inFlightCreations.contains(coreName)) {
+          String msg = "Already creating a core with name '" + coreName + "', call aborted '";
+          log.warn(msg);
+          throw new SolrException(ErrorCode.SERVER_ERROR, msg);
         }
-        preExisitingZkEntry = getZkController().checkIfCoreNodeNameAlreadyExists(cd);
+        inFlightCreations.add(coreName);
       }
+      CoreDescriptor cd = new CoreDescriptor(coreName, instancePath, parameters, getContainerProperties(), getZkController());
+
+      // TODO: There's a race here, isn't there?
+      // Since the core descriptor is removed when a core is unloaded, it should never be anywhere when a core is created.
+      if (getAllCoreNames().contains(coreName)) {
+        log.warn("Creating a core with existing name is not allowed");
+        // TODO: Shouldn't this be a BAD_REQUEST?
+        throw new SolrException(ErrorCode.SERVER_ERROR, "Core with name '" + coreName + "' already exists.");
+      }
 
-      // Much of the logic in core handling pre-supposes that the core.properties file already exists, so create it
-      // first and clean it up if there's an error.
-      coresLocator.create(this, cd);
+      // Validate paths are relative to known locations to avoid path traversal
+      assertPathAllowed(cd.getInstanceDir());
+      assertPathAllowed(Paths.get(cd.getDataDir()));
 
-      SolrCore core = null;
+      boolean preExisitingZkEntry = false;
       try {
-        solrCores.waitAddPendingCoreOps(cd.getName());
-        core = createFromDescriptor(cd, true, newCollection);
-        coresLocator.persist(this, cd); // Write out the current core properties in case anything changed when the core was created
-      } finally {
-        solrCores.removeFromPendingOps(cd.getName());
-      }
+        if (getZkController() != null) {
+          if (cd.getCloudDescriptor().getCoreNodeName() == null) {
+            throw new SolrException(ErrorCode.SERVER_ERROR, "coreNodeName missing " + parameters.toString());
+          }
+          preExisitingZkEntry = getZkController().checkIfCoreNodeNameAlreadyExists(cd);
+        }
 
-      return core;
-    } catch (Exception ex) {
-      // First clean up any core descriptor, there should never be an existing core.properties file for any core that
-      // failed to be created on-the-fly.
-      coresLocator.delete(this, cd);
-      if (isZooKeeperAware() && !preExisitingZkEntry) {
-        try {
-          getZkController().unregister(coreName, cd);
-        } catch (InterruptedException e) {
-          Thread.currentThread().interrupt();
-          SolrException.log(log, null, e);
-        } catch (KeeperException e) {
-          SolrException.log(log, null, e);
-        } catch (Exception e) {
-          SolrException.log(log, null, e);
-        }
-      }
-      Throwable tc = ex;
+        // Much of the logic in core handling pre-supposes that the core.properties file already exists, so create it
+        // first and clean it up if there's an error.
+        coresLocator.create(this, cd);
+
+        SolrCore core = null;
+        try {
+          solrCores.waitAddPendingCoreOps(cd.getName());
+          core = createFromDescriptor(cd, true, newCollection);
+          coresLocator.persist(this, cd); // Write out the current core properties in case anything changed when the core was created
+        } finally {
+          solrCores.removeFromPendingOps(cd.getName());
+        }
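The "Lock held by this virtual machine" failures in the stack traces above come from Lucene's `NativeFSLockFactory`, which builds on `java.nio` file locks and fails fast when the same JVM already holds the `write.lock`. A stdlib-only analog (hypothetical demo class; the real factory throws `LockObtainFailedException` rather than `OverlappingFileLockException`, but the fail-fast behavior is the same):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class WriteLockDemo {
    /** Acquires a file lock, then tries again from the same JVM. */
    public static String tryDoubleLock() throws IOException {
        Path lockFile = Files.createTempFile("demo", ".lock");
        try (FileChannel ch = FileChannel.open(lockFile, StandardOpenOption.WRITE)) {
            FileLock first = ch.tryLock();   // first acquisition succeeds
            try {
                ch.tryLock();                // overlapping lock in the same JVM: fails fast
                return "unexpectedly acquired twice";
            } catch (OverlappingFileLockException e) {
                return "lock held by this virtual machine";
            } finally {
                first.release();
            }
        } finally {
            Files.deleteIfExists(lockFile);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(tryDoubleLock());
    }
}
```

This is why the second concurrent CREATE in this issue dies inside `IndexWriter`: the first creation's still-open writer holds the lock for the same index directory.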
[GitHub] [lucene-solr] CaoManhDat edited a comment on pull request #2026: LUCENE-8626: Standardize Lucene Test Files
CaoManhDat edited a comment on pull request #2026: URL: https://github.com/apache/lucene-solr/pull/2026#issuecomment-719286855 Thank you @MarcusSorealheis and everyone This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] CaoManhDat commented on pull request #2026: LUCENE-8626: Standardize Lucene Test Files
CaoManhDat commented on pull request #2026: URL: https://github.com/apache/lucene-solr/pull/2026#issuecomment-719286855 Thank you @MarcusSorealheis
[jira] [Commented] (LUCENE-8626) standardise test class naming
[ https://issues.apache.org/jira/browse/LUCENE-8626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223444#comment-17223444 ] ASF subversion and git services commented on LUCENE-8626: - Commit 57729c9acaace26f37644c42a0b0889508e589ba in lucene-solr's branch refs/heads/master from Marcus [ https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=57729c9 ] LUCENE-8626: Standardize Lucene Test Files (#2026) > standardise test class naming > - > > Key: LUCENE-8626 > URL: https://issues.apache.org/jira/browse/LUCENE-8626 > Project: Lucene - Core > Issue Type: Test >Reporter: Christine Poerschke >Priority: Major > Attachments: SOLR-12939.01.patch, SOLR-12939.02.patch, > SOLR-12939.03.patch, SOLR-12939_hoss_validation_groovy_experiment.patch > > Time Spent: 2h 10m > Remaining Estimate: 0h > > This was mentioned and proposed on the dev mailing list. Starting this ticket > here to start to make it happen? > History: This ticket was created as > https://issues.apache.org/jira/browse/SOLR-12939 ticket and then got > JIRA-moved to become https://issues.apache.org/jira/browse/LUCENE-8626 ticket. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org
[GitHub] [lucene-solr] CaoManhDat merged pull request #2026: LUCENE-8626: Standardize Lucene Test Files
CaoManhDat merged pull request #2026: URL: https://github.com/apache/lucene-solr/pull/2026
[jira] [Commented] (SOLR-14034) remove deprecated min_rf references
[ https://issues.apache.org/jira/browse/SOLR-14034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17223427#comment-17223427 ] Tim Dillon commented on SOLR-14034: --- [~marcussorealheis] I see it has been a while since there was any activity on this; is it still available to work on? I'm new here as well, and this seems like a good place to get started. > remove deprecated min_rf references > --- > > Key: SOLR-14034 > Project: Solr > Issue Type: Task > Reporter: Christine Poerschke > Priority: Blocker > Labels: newdev > Fix For: master (9.0) > > > * {{min_rf}} support was added under SOLR-5468 in version 4.9 > (https://github.com/apache/lucene-solr/blob/releases/lucene-solr/4.9.0/solr/solrj/src/java/org/apache/solr/client/solrj/request/UpdateRequest.java#L50) > and deprecated under SOLR-12767 in version 7.6 > (https://github.com/apache/lucene-solr/blob/releases/lucene-solr/7.6.0/solr/solrj/src/java/org/apache/solr/client/solrj/request/UpdateRequest.java#L57-L61) > * http://lucene.apache.org/solr/7_6_0/changes/Changes.html and > https://lucene.apache.org/solr/guide/8_0/major-changes-in-solr-8.html#solr-7-6 > both clearly mention the deprecation > This ticket is to fully remove {{min_rf}} references in code, tests and > documentation.