warperwolf opened a new pull request #324: URL: https://github.com/apache/solr/pull/324
https://issues.apache.org/jira/browse/SOLR-14660

# Description

This PR moves HDFS from Core to a Contrib module.

Note: I kept separate commits to make it easier to review and to change / revert if necessary. They can be squashed once everything is polished and we have a green light.

# Solution

- A new contrib module has been created with its own build.
- HDFS classes have been moved to the new module.
- This PR does not yet take advantage of the Solr packaging system; this is just a separate module. Implementing it as a package will be a next step.
- Direct references to the HDFS directory classes have been removed from Solr core. For example, the UpdateHandler had a direct reference to HdfsDirectoryFactory; I removed it so that it works the same way as other directory factories and can also be loaded as a plugin (`core.getResourceLoader().newInstance(ulogPluginInfo, UpdateLog.class, true);`).
- Solr Core still has some Hadoop-related (but not HDFS-related) classes, mostly Hadoop authentication. These are still left in Core; if they need to be separated, that will be handled in a separate Jira.
- The block cache feature was only used by HDFS, so it was moved to the contrib module.
- The only change needed to use this module is to include its jar on the classpath (for example by symlinking the jar into the `WEB-INF/lib` of the webapp); no changes to `solrconfig.xml` are needed.
- DirectoryFactory.java had `public final static String LOCK_TYPE_HDFS = "hdfs";`, but this was only referenced from the tests and from HdfsDirectoryFactory, so I removed it.
- Removed the deprecated flags from the code and the warning from the ref guide.
- Changed the reference in the ref guide to the new location of HdfsDirectoryFactory.
- Added a simple readme markdown doc.

# Tests

- Existing tests have been moved to the new contrib module where needed.
- Tests have been refactored where an HDFS test was extending a Core test class.
  For example, HdfsChaosMonkeyNothingIsSafeTest (HDFS test) extends ChaosMonkeyNothingIsSafeTest (core test). Gradle does not allow dependencies between the test projects of two different modules, so in such cases I introduced an abstract base class (here AbstractChaosMonkeyNothingIsSafeTest) which is extended by both. A good place to store these classes would be Gradle's test fixtures feature, which is designed for exactly this purpose; however, some of the Gradle plugins Solr uses fail because they are not compatible with fixtures (see https://issues.apache.org/jira/browse/SOLR-14660?focusedCommentId=17409690&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17409690 ), so I moved these classes to the Solr Test Framework project.
- Solr core had some helper classes which are core related but are also referenced by the HDFS classes (for example MockCoreContainer.java). I first moved those to the Gradle test fixtures of the Core project, and later moved them to the test framework.
- I also created a Cloudera CDP test release, built a cluster with HDFS and Solr, and tested there.
- TestBackupRepositoryFactory had some references to HdfsBackupRepository, but the test case is not HDFS related. Removed that part.
- HDFSRecoveryZKTest was also failing before this change. Added an AwaitsFix annotation; it needs to be fixed separately.

# Checklist

Please review the following and check all that apply:

- [x] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/solr/HowToContribute) and my code conforms to the standards described there to the best of my ability.
- [x] I have created a Jira issue and added the issue ID to my pull request title.
- [x] I have given Solr maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
- [x] I have developed this patch against the `main` branch.
- [x] I have run `./gradlew check`.
- [x] I have added tests for my changes.
- [x] I have added documentation for the [Reference Guide](https://github.com/apache/solr/tree/main/solr/solr-ref-guide)
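
For reviewers who want a picture of the test refactoring described under Tests, here is a minimal sketch of the shared abstract base class pattern. It is not the code from this PR: all class names, the `directoryFactoryName()` hook, and the scenario steps are hypothetical stand-ins, and the real tests are far more involved. The idea is only that the shared logic lives in one abstract class (in a module both test projects can depend on), and each module contributes a small concrete subclass.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a base class that would live in the shared test framework module. */
abstract class AbstractStorageScenarioTest {
    /** The only module-specific part: each subclass names the storage backend it exercises. */
    protected abstract String directoryFactoryName();

    /** Shared scenario logic, written once and inherited by both concrete tests. */
    public final List<String> runScenario() {
        List<String> steps = new ArrayList<>();
        steps.add("start cluster with " + directoryFactoryName());
        steps.add("index documents");
        steps.add("verify replicas are consistent");
        return steps;
    }
}

/** Would live in the core module's tests (hypothetical name). */
class LocalStorageScenarioTest extends AbstractStorageScenarioTest {
    @Override
    protected String directoryFactoryName() {
        return "local-directory-factory";
    }
}

/** Would live in the HDFS contrib module's tests; it needs nothing from core's test classes. */
class HdfsStorageScenarioTest extends AbstractStorageScenarioTest {
    @Override
    protected String directoryFactoryName() {
        return "hdfs-directory-factory";
    }
}

public class SharedBaseClassSketch {
    public static void main(String[] args) {
        AbstractStorageScenarioTest[] tests = {
            new LocalStorageScenarioTest(),
            new HdfsStorageScenarioTest()
        };
        // Both tests run the same inherited scenario against different backends.
        for (AbstractStorageScenarioTest test : tests) {
            System.out.println(test.getClass().getSimpleName() + ": " + test.runScenario());
        }
    }
}
```

With this shape, neither concrete test depends on the other module's test sources; both depend only on the module that hosts the abstract class.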