+1

SUCCESS! [1:36:12.056443]

On Tue, Jun 15, 2021 at 4:26 PM Mayya Sharipova <[email protected]> wrote:
> Thanks Robert for such detailed investigations.
>
> Lucene-Solr-SmokeRelease-8.9 also had 2 recent failures. The failures are
> not reproducible on my local machine.
>
> build #13: ant test -Dtestcase=SolrCloudReportersTest
> -Dtests.method=testExplicitConfiguration -Dtests.seed=60FEAB39C2B47705
> -Dtests.multiplier=2 -Dtests.locale=ro-RO
> -Dtests.timezone=Africa/Brazzaville -Dtests.asserts=true
> -Dtests.file.encoding=UTF-8
> build #12: ant test -Dtestcase=LeaderTragicEventTest
> -Dtests.method=testLeaderFailsOver -Dtests.seed=BB301A174F4BDB5
> -Dtests.multiplier=2 -Dtests.locale=sr-Latn-RS
> -Dtests.timezone=Africa/Bissau -Dtests.asserts=true
> -Dtests.file.encoding=ISO-8859-1
>
> On Sat, Jun 12, 2021 at 5:06 PM Robert Muir <[email protected]> wrote:
>
>> OK I managed to finally get this smoketester to pass on my machine, so
>> for THIS release I will retract my -1 and change it to a +1.
>>
>> I have reset my system configuration back though, so we should really
>> fix these test problems for the future.
>>
>> SUCCESS! [1:08:26.448122]
>>
>> There were a few compounding issues, I will break out some issues a
>> bit later. I don't think they need to be blockers for THIS release,
>> but let's please fix them! I can help try to dig on each one, but here
>> are the biggest two problems:
>>
>> 1. Some solr tests don't obey their sandbox and fail with
>> tests.workDir (if it is set in the user's build.properties). These
>> tests try to access wrong parts of the filesystem, which can cause
>> tests to meddle with each other. Obeying the test sandbox
>> (tests.workDir) is important, it is how I prevent these tests from
>> destroying my SSDs.
>>
>> 2. Some solr HDFS tests will falsely fail if they "think" disk space
>> is low (even when it is not running out). They dump megabytes of
>> output, but this part is the key:
>>
>>    [junit4]   2> 1000960 WARN (IPC Server handler 3 on 33951) [ ]
>> o.a.h.h.s.b.BlockPlacementPolicy Failed to place enough replicas,
>> still in need of 2 to reach 2 (unavailableStorages=[],
>> storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK],
>> creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true)
>> For more information, please enable DEBUG log level on
>> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and
>> org.apache.hadoop.net.NetworkTopology
>>    [junit4]   2> 1000960 WARN (IPC Server handler 3 on 33951) [ ]
>> o.a.h.h.p.BlockStoragePolicy Failed to place enough replicas: expected
>> size is 2 but only 0 storage types can be selected (replication=2,
>> selected=[], unavailable=[DISK], removed=[DISK, DISK],
>> policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK],
>> creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
>>    [junit4]   2> 1000960 WARN (IPC Server handler 3 on 33951) [ ]
>> o.a.h.h.s.b.BlockPlacementPolicy Failed to place enough replicas,
>> still in need of 2 to reach 2 (unavailableStorages=[DISK],
>> storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK],
>> creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true)
>> All required storage types are unavailable:
>> unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7,
>> storageTypes=[DISK], creationFallbacks=[],
>> replicationFallbacks=[ARCHIVE]}
>>    [junit4]   2> 1000961 WARN (Thread-2642) [ ]
>> o.a.h.h.DataStreamer DataStreamer Exception
>>    [junit4]   2> =>
>> org.apache.hadoop.ipc.RemoteException(java.io.IOException): File
>> /testfile could only be written to 0 of the 1 minReplication nodes.
>> There are 2 datanode(s) running and 2 node(s) are excluded in this
>> operation.
>>
>> So I think these tests should be tweaked to not require gigabytes of
>> free space to pass (fix the threshold or whatever, or add an assume
>> or something). I worked around the situation by temporarily
>> repartitioning and giving them another gigabyte (!). In no event was
>> there ever any danger of running out of space! They just falsely fail
>> even when there are hundreds of MB available. Seems they have some
>> kind of bogus threshold in the algorithm (e.g. inspecting percentages
>> or something).
>>
>> On Sat, Jun 12, 2021 at 12:22 PM Robert Muir <[email protected]> wrote:
>> >
>> > The tests also aren't "timing out". They are failing.
>> >
>> > On Sat, Jun 12, 2021 at 12:21 PM Robert Muir <[email protected]> wrote:
>> > >
>> > > Ishan, no, they aren't running out of resources, not even close. I have
>> > > 20GB of RAM and by default it is only using 3 JVMs.
>> > >
>> > > On Sat, Jun 12, 2021 at 12:04 PM Ishan Chattopadhyaya
>> > > <[email protected]> wrote:
>> > > >
>> > > > Hi Rob, could it be possible that the tests are timing out on your
>> > > > machine due to lack of resources? Can you try running them with just
>> > > > one JVM at a time?
>> > > >
>> > > > On Sat, 12 Jun, 2021, 8:20 pm Robert Muir, <[email protected]> wrote:
>> > > >>
>> > > >> I ran smoketester yet one more time and again numerous tests fail:
>> > > >>
>> > > >>    [junit4] Tests with failures [seed: A3FDDCE09965D7AE] (first 10 out of 18):
>> > > >>    [junit4]   - org.apache.solr.update.TestHdfsUpdateLog.testFSThreadSafety
>> > > >>    [junit4]   - org.apache.solr.update.TestHdfsUpdateLog (suite)
>> > > >>    [junit4]   - org.apache.solr.cloud.hdfs.HDFSCollectionsAPITest.testDataDirIsNotReused
>> > > >>    [junit4]   - org.apache.solr.cloud.hdfs.HdfsRecoverLeaseTest.testBasic
>> > > >>    [junit4]   - org.apache.solr.cloud.hdfs.HdfsRecoverLeaseTest.testMultiThreaded
>> > > >>    [junit4]   - org.apache.solr.cloud.hdfs.HdfsRecoverLeaseTest (suite)
>> > > >>    [junit4]   - org.apache.solr.core.backup.repository.HdfsBackupRepositoryIntegrationTest.testCanDistinguishBetweenFilesAndDirectories
>> > > >>    [junit4]   - org.apache.solr.core.backup.repository.HdfsBackupRepositoryIntegrationTest.testCanDeleteEmptyOrFullDirectories
>> > > >>    [junit4]   - org.apache.solr.core.backup.repository.HdfsBackupRepositoryIntegrationTest.testCanDeleteIndividualFiles
>> > > >>    [junit4]   - org.apache.solr.core.backup.repository.HdfsBackupRepositoryIntegrationTest.testArbitraryFileDataCanBeStoredAndRetrieved
>> > > >>    [junit4]
>> > > >>    [junit4]
>> > > >>    [junit4] JVM J0:     0.72 ..  1241.80 =  1241.08s
>> > > >>    [junit4] JVM J1:     0.67 ..  1198.72 =  1198.05s
>> > > >>    [junit4] JVM J2:     0.91 ..  1198.75 =  1197.84s
>> > > >>    [junit4] Execution time total: 20 minutes 41 seconds
>> > > >>    [junit4] Tests summary: 939 suites (5 ignored), 4884 tests, 3
>> > > >> suite-level errors, 15 errors, 1 failure, 2581 ignored (506
>> > > >> assumptions)
>> > > >>
>> > > >> On Fri, Jun 11, 2021 at 11:52 AM Robert Muir <[email protected]> wrote:
>> > > >> >
>> > > >> > After nuking all settings (I simply removed the whole
>> > > >> > lucene.build.properties in my homedir), it still fails. Seems
>> > > >> > maybe like fewer failures though?
>> > > >> >
>> > > >> > I will upload logs to the JIRA issue.
>> > > >> >
>> > > >> >    [junit4] Completed [939/939 (4!)] on J2 in 383.02s, 2 tests, 1
>> > > >> > failure <<< FAILURES!
>> > > >> >    [junit4]
>> > > >> >    [junit4]
>> > > >> >    [junit4] Tests with failures [seed: AC205159663D0461]:
>> > > >> >    [junit4]   - org.apache.solr.update.TestHdfsUpdateLog.testFSThreadSafety
>> > > >> >    [junit4]   - org.apache.solr.update.TestHdfsUpdateLog (suite)
>> > > >> >    [junit4]   - org.apache.solr.core.HdfsDirectoryFactoryTest.testLocalityReporter
>> > > >> >    [junit4]   - org.apache.solr.cloud.hdfs.HdfsRecoverLeaseTest.testBasic
>> > > >> >    [junit4]   - org.apache.solr.cloud.hdfs.HdfsRecoverLeaseTest.testMultiThreaded
>> > > >> >    [junit4]   - org.apache.solr.cloud.hdfs.HdfsRecoverLeaseTest (suite)
>> > > >> >    [junit4]   - org.apache.solr.cloud.api.collections.TestLocalFSCloudBackupRestore.test
>> > > >> >    [junit4]
>> > > >> >    [junit4]
>> > > >> >    [junit4] JVM J0:     0.68 ..  1197.30 =  1196.62s
>> > > >> >    [junit4] JVM J1:     0.71 ..  1113.59 =  1112.89s
>> > > >> >    [junit4] JVM J2:     0.68 ..  1406.83 =  1406.15s
>> > > >> >    [junit4] Execution time total: 23 minutes 26 seconds
>> > > >> >    [junit4] Tests summary: 939 suites (5 ignored), 4884 tests, 3
>> > > >> > suite-level errors, 4 errors, 1 failure, 2457 ignored (517
>> > > >> > assumptions)
>> > > >> >
>> > > >> > BUILD FAILED
>> > > >> > /tmp/smoke_lucene_8.9.0_05c8a6f0163fe4c330e93775e8e91f3ab66a3f80/unpack/solr-8.9.0/solr/build.xml:231:
>> > > >> > The following error occurred while executing this line:
>> > > >> > /tmp/smoke_lucene_8.9.0_05c8a6f0163fe4c330e93775e8e91f3ab66a3f80/unpack/solr-8.9.0/solr/common-build.xml:550:
>> > > >> > The following error occurred while executing this line:
>> > > >> > /tmp/smoke_lucene_8.9.0_05c8a6f0163fe4c330e93775e8e91f3ab66a3f80/unpack/solr-8.9.0/lucene/common-build.xml:1608:
>> > > >> > The following error occurred while executing this line:
>> > > >> > /tmp/smoke_lucene_8.9.0_05c8a6f0163fe4c330e93775e8e91f3ab66a3f80/unpack/solr-8.9.0/lucene/common-build.xml:1135:
>> > > >> > There were test failures: 939 suites (5 ignored), 4884 tests, 3
>> > > >> > suite-level errors, 4 errors, 1 failure, 2457 ignored (517
>> > > >> > assumptions) [seed: AC205159663D0461]
>> > > >> >
>> > > >> > Total time: 24 minutes 16 seconds
>> > > >> >
>> > > >> >
>> > > >> > Traceback (most recent call last):
>> > > >> >   File "/home/rmuir/workspace/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 1495, in <module>
>> > > >> >     main()
>> > > >> >   File "/home/rmuir/workspace/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 1417, in main
>> > > >> >     smokeTest(c.java, c.url, c.revision, c.version, c.tmp_dir,
>> > > >> > c.is_signed, c.local_keys, ' '.join(c.test_args),
>> > > >> >   File "/home/rmuir/workspace/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 1483, in smokeTest
>> > > >> >     solrSrcUnpackPath = unpackAndVerify(java, 'solr', tmpDir,
>> > > >> > 'solr-%s-src.tgz' % version,
>> > > >> >   File "/home/rmuir/workspace/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 566, in unpackAndVerify
>> > > >> >     verifyUnpacked(java, project, artifact, unpackPath, gitRevision,
>> > > >> > version, testArgs, tmpDir, baseURL)
>> > > >> >   File "/home/rmuir/workspace/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 687, in verifyUnpacked
>> > > >> >     java.run_java8('ant clean test -Dtests.slow=false %s' % testArgs,
>> > > >> > '%s/test.log' % unpackPath)
>> > > >> >   File "/home/rmuir/workspace/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 1212, in run_java
>> > > >> >     run('%s; %s' % (cmd_prefix, cmd), logfile)
>> > > >> >   File "/home/rmuir/workspace/lucene-solr/dev-tools/scripts/smokeTestRelease.py", line 500, in run
>> > > >> >     raise RuntimeError('command "%s" failed; see log file %s' %
>> > > >> > (command, logPath))
>> > > >> > RuntimeError: command "export
>> > > >> > JAVA_HOME="/home/rmuir/Downloads/jdk8u282-b08"
>> > > >> > PATH="/home/rmuir/Downloads/jdk8u282-b08/bin:$PATH"
>> > > >> > JAVACMD="/home/rmuir/Downloads/jdk8u282-b08/bin/java"; ant clean test
>> > > >> > -Dtests.slow=false -Dtests.badapples=false " failed; see log file
>> > > >> > /tmp/smoke_lucene_8.9.0_05c8a6f0163fe4c330e93775e8e91f3ab66a3f80/unpack/solr-8.9.0/test.log
>> > > >> >
>> > > >> > On Fri, Jun 11, 2021 at 9:47 AM Robert Muir <[email protected]> wrote:
>> > > >> > >
>> > > >> > > I nuked all my settings and am rerunning with all defaults. I'll
>> > > >> > > report back what happens/upload log when/if it finishes or fails.
>> > > >> > >
>> > > >> > > On Fri, Jun 11, 2021 at 9:45 AM Michael Sokolov <[email protected]> wrote:
>> > > >> > > >
>> > > >> > > > I tried to comment on the JIRA, but it seems to be timing out.
>> > > >> > > > Now when I go back, SOLR issues are marked as "You can't view
>> > > >> > > > this issue. It may have been deleted or you don't have
>> > > >> > > > permission to view it." Waat?
>> > > >> > > >
>> > > >> > > > Anyway, Robert, you suggested there that maybe the problem is being
>> > > >> > > > surfaced by using a different working directory for the tests. Do you
>> > > >> > > > think that the tests need to be fixed so that they work with this
>> > > >> > > > tests.workDir parameter? What if you were to cd to the place you want
>> > > >> > > > to use as the working dir and call the smokeTester from there?
>> > > >> > > >
>> > > >> > > >
>> > > >> > > > On Fri, Jun 11, 2021 at 9:29 AM Mayya Sharipova
>> > > >> > > > <[email protected]> wrote:
>> > > >> > > > >
>> > > >> > > > > Thanks very much Robert for detailed investigations, and
>> > > >> > > > > thanks Jan for your tests.
>> > > >> > > > >
>> > > >> > > > > I will sort out the problem with my GPG key, but I am not
>> > > >> > > > > sure what to do with this SOLR-15473. I've run the smoke
>> > > >> > > > > tester again, and it passed on my Mac again:
>> > > >> > > > > SUCCESS! [1:00:03.751500]
>> > > >> > > > > I would appreciate more guidance on whether we need to
>> > > >> > > > > resolve SOLR-15473 before the 8.9 release.
>> > > >> > > > >
>> > > >> > > > >
>> > > >> > > > > On Fri, Jun 11, 2021 at 8:09 AM Robert Muir <[email protected]> wrote:
>> > > >> > > > >>
>> > > >> > > > >> Dude, if you can vote +1 when the smoketester passes, then I can vote
>> > > >> > > > >> -1 when it fails. This is my vote, not your vote. You don't get to
>> > > >> > > > >> decide about it, or change it in any way.
>> > > >> > > > >>
>> > > >> > > > >> On Fri, Jun 11, 2021 at 8:04 AM Jan Høydahl <[email protected]> wrote:
>> > > >> > > > >> >
>> > > >> > > > >> > Does it reproduce for you? Are you suspecting a bug in
>> > > >> > > > >> > Solr that we cannot ship, or only a bug in the smoketester
>> > > >> > > > >> > py itself? The -1 should be about the released bits, not
>> > > >> > > > >> > about other tooling?
>> > > >> > > > >> > My JVM is OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build
>> > > >> > > > >> > 25.292-b10, mixed mode)
>> > > >> > > > >> >
>> > > >> > > > >> > Jan
>> > > >> > > > >> >
>> > > >> > > > >> > > On 11 Jun 2021, at 13:48, Robert Muir <[email protected]> wrote:
>> > > >> > > > >> > >
>> > > >> > > > >> > > Jan, I'm using the same automated smoketester as everyone else. It
>> > > >> > > > >> > > fails, so my vote is -1.
>> > > >> > > > >> > >
>> > > >> > > > >> > > On Fri, Jun 11, 2021 at 7:22 AM Jan Høydahl <[email protected]> wrote:
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> Tested on macOS (Intel). No other verification than
>> > > >> > > > >> > >> the smoketester done.
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> SUCCESS! [1:08:19.953492]
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> +1
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> Robert - not sure if one test-run failure should
>> > > >> > > > >> > >> cancel the build. Our smoketester and tests are
>> > > >> > > > >> > >> sometimes a bit picky, and that does not mean that
>> > > >> > > > >> > >> the artifacts are faulty.
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> Jan
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> On 11 Jun 2021, at 04:14, Mayya Sharipova <[email protected]> wrote:
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> Please vote for release candidate 1 for Lucene/Solr 8.9.0
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> The artifacts can be downloaded from:
>> > > >> > > > >> > >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.9.0-RC1-rev05c8a6f0163fe4c330e93775e8e91f3ab66a3f80
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> You can run the smoke tester directly with this command:
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> python3 -u dev-tools/scripts/smokeTestRelease.py \
>> > > >> > > > >> > >> https://dist.apache.org/repos/dist/dev/lucene/lucene-solr-8.9.0-RC1-rev05c8a6f0163fe4c330e93775e8e91f3ab66a3f80
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> The vote will be open for at least 72 hours, i.e.
>> > > >> > > > >> > >> until 2021-06-16 02:00 UTC.
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> [ ] +1 approve
>> > > >> > > > >> > >> [ ] +0 no opinion
>> > > >> > > > >> > >> [ ] -1 disapprove (and reason why)
>> > > >> > > > >> > >>
>> > > >> > > > >> > >> Here is my +1
>> > > >> > > > >> > >> SUCCESS! [0:01:43.815224]
>> > > >> > > > >> > >>
>> > > >> > > > >> > >
>> > > >> > > > >> > > ---------------------------------------------------------------------
>> > > >> > > > >> > > To unsubscribe, e-mail: [email protected]
>> > > >> > > > >> > > For additional commands, e-mail: [email protected]
>>

--
Adrien
