I would also like to know if this is possible. From going over the GitHub page, it seems there is a JMX method to force the creation of a snapshot. Yet the Docker image is configured such that a port will never be assigned to the JMX process.
Is there any way to bypass this?

On Tue, Jul 30, 2019 at 8:51 AM Jörn Franke <jornfra...@gmail.com> wrote:

> Thanks. Is it possible to force ZooKeeper to create a snapshot? I will
> check; I think the snapshot count is set to 1 in the cfg.
>
> > On 30.07.2019 at 08:06, Enrico Olivelli <eolive...@gmail.com> wrote:
> >
> > On Mon, Jul 29, 2019 at 23:59, Jörn Franke <jornfra...@gmail.com> wrote:
> >
> >> ok, then let me verify tomorrow if a snapshot file is indeed there. If
> >> it is missing then I wonder why it was missing. There was no crash or
> >> whatever and 3.4.14 works without issue, but of course it could have
> >> loaded them from the log files. However, then I wonder why it does not
> >> create one.
> >
> > I remember now that some other user, I think Sijie, reported a similar
> > problem some months ago, that it is not possible to upgrade from 3.4 to
> > 3.5 if no snapshot is present.
> > IIRC the fix was to force the creation of at least one snapshot file and
> > then upgrade.
> >
> > Enrico
> >
> >> On Mon, Jul 29, 2019 at 11:45 PM Michael Han <h...@apache.org> wrote:
> >>
> >>> >> I just wonder why it does not find a valid snapshot.
> >>>
> >>> If there are local snapshot files and the files are valid, then it's a
> >>> bug that server fails to load them.
> >>>
> >>> >> Is it because the format changed in 3.5.5 compared to 3.4.14?
> >>>
> >>> Not that I am aware of. There are some format changes (added
> >>> compression support) in master branch, but that's not shipped with
> >>> 3.5.5.
> >>>
> >>> On Mon, Jul 29, 2019 at 2:31 PM Jörn Franke <jornfra...@gmail.com> wrote:
> >>>
> >>>> ok, then it affects basically all standalone nodes? This is fine,
> >>>> despite that it means some extra work (for uncritical lab environments).
> >>>> I am not sure it is ZOOKEEPER-2325 (but I don't know the full history
> >>>> behind it). The logs are fine (it works in 3.4.14 without issues, even
> >>>> after downgrading back). There is no issue with disk space and there
> >>>> are no 0-byte files. I just wonder why it does not find a valid
> >>>> snapshot. Is it because the format changed in 3.5.5 compared to 3.4.14?
> >>>>
> >>>> On Mon, Jul 29, 2019 at 11:25 PM Michael Han <h...@apache.org> wrote:
> >>>>
> >>>>> >> java.io.IOException: No snapshot found, but there are log entries.
> >>>>> >> Something is broken!
> >>>>>
> >>>>> This is expected behavior introduced in ZOOKEEPER-2325. We don't want
> >>>>> to end up with potential inconsistent state across the ensemble when
> >>>>> recovering from empty snapshot.
> >>>>>
> >>>>> To continue upgrade, just delete all txn log files and let the node
> >>>>> sync the snapshot from the quorum.
> >>>>>
> >>>>> On Mon, Jul 29, 2019 at 1:38 PM Enrico Olivelli <eolive...@gmail.com> wrote:
> >>>>>
> >>>>>> On Mon, Jul 29, 2019 at 22:32, Jörn Franke <jornfra...@gmail.com> wrote:
> >>>>>>
> >>>>>>> It also seems that 3.5.5 does not attempt to read all of the
> >>>>>>> logfiles (I have to still confirm), but the two it reads exist, it
> >>>>>>> has access and they are much more than 0 bytes.
> >>>>>>
> >>>>>> We should have the stack trace of the EOFException.
> >>>>>>
> >>>>>> Does anyone on this list have a better idea?
> >>>>>>
> >>>>>> Enrico
> >>>>>>
> >>>>>>> On Mon, Jul 29, 2019 at 10:13 PM Jörn Franke <jornfra...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> (of course I do not run them at the same time)
> >>>>>>>>
> >>>>>>>> On Mon, Jul 29, 2019 at 10:10 PM Jörn Franke <jornfra...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> thank you for the quick reply.
> >>>>>>>>> They read from the same disk paths and have the same access
> >>>>>>>>> rights (in fact the RHEL service executes them as the same
> >>>>>>>>> specific user).
> >>>>>>>>>
> >>>>>>>>> On Mon, Jul 29, 2019 at 10:09 PM Enrico Olivelli <eolive...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> On Mon, Jul 29, 2019 at 21:50, Jörn Franke <jornfra...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi,
> >>>>>>>>>>>
> >>>>>>>>>>> I tried to migrate a lab environment from ZooKeeper 3.4.14
> >>>>>>>>>>> (used for Solr) to 3.5.5 and encountered an issue. It is
> >>>>>>>>>>> ZooKeeper in standalone mode (other environments have a proper
> >>>>>>>>>>> ensemble). I increased jute.maxbuffer beyond the default (but
> >>>>>>>>>>> not excessively) - this was working perfectly fine in 3.4.14.
> >>>>>>>>>>>
> >>>>>>>>>>> Basically I reuse for the migration the same config files,
> >>>>>>>>>>> except that I whitelist some commands (later I am also
> >>>>>>>>>>> interested in adding SSL).
> >>>>>>>>>>>
> >>>>>>>>>>> I have the following error message when starting ZooKeeper
> >>>>>>>>>>> with 3.5.5 (basically, I just changed the symbolic link
> >>>>>>>>>>> "zookeeper" to point to the 3.5.5 directory instead of the
> >>>>>>>>>>> 3.4.14 directory):
> >>>>>>>>>>>
> >>>>>>>>>>> 2019-07-29 15:16:25,217 [myid:] - DEBUG [main:FileTxnLog$FileTxnIterator@655] - Created new input stream /zookeeper/version-2/log.b34
> >>>>>>>>>>> 2019-07-29 15:16:25,217 [myid:] - DEBUG [main:FileTxnLog$FileTxnIterator@658] - Created new input archive /zookeeper/version-2/log.b34
> >>>>>>>>>>> 2019-07-29 15:16:25,222 [myid:] - DEBUG [main:FileTxnLog$FileTxnIterator@696] - EOF exception java.io.EOFException: Failed to read /zookeeper/version-2/log.b34
> >>>>>>>>>>> 2019-07-29 15:16:25,223 [myid:] - DEBUG [main:FileTxnLog$FileTxnIterator@655] - Created new input stream /zookeeper/version-2/log.b72
> >>>>>>>>>>> 2019-07-29 15:16:25,223 [myid:] - DEBUG [main:FileTxnLog$FileTxnIterator@658] - Created new input archive /zookeeper/version-2/log.b72
> >>>>>>>>>>> 2019-07-29 15:16:25,224 [myid:] - DEBUG [main:FileTxnLog$FileTxnIterator@696] - EOF exception java.io.EOFException: Failed to read /zookeeper/version-2/log.b72
> >>>>>>>>>>> 2019-07-29 15:16:25,224 [myid:] - ERROR [main:ZooKeeperServerMain@83] - Unexpected exception, exiting abnormally
> >>>>>>>>>>> java.io.IOException: No snapshot found, but there are log entries. Something is broken!
> >>>>>>>>>>>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:211)
> >>>>>>>>>>>     at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:240)
> >>>>>>>>>>>     at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:290)
> >>>>>>>>>>>     at org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:450)
> >>>>>>>>>>>     at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:764)
> >>>>>>>>>>>     at org.apache.zookeeper.server.ServerCnxnFactory.startup(ServerCnxnFactory.java:98)
> >>>>>>>>>>>     at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:144)
> >>>>>>>>>>>     at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:106)
> >>>>>>>>>>>     at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:64)
> >>>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:128)
> >>>>>>>>>>>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:82)
> >>>>>>>>>>>
> >>>>>>>>>>> Strangely enough, if I switch back to 3.4.14 the issue is
> >>>>>>>>>>> resolved and ZooKeeper works normally. However, I would like
> >>>>>>>>>>> to leverage the new version 3.5.5.
> >>>>>>>>>>>
> >>>>>>>>>>> There are no 0-byte files. Plenty of disk space is available.
> >>>>>>>>>>
> >>>>>>>>>> Can you compare these logs with the logs of 3.4.x? Are they
> >>>>>>>>>> reading from the same disk paths?
> >>>>>>>>>>
> >>>>>>>>>>> Any idea beyond erasing the data dir (I would try to avoid it,
> >>>>>>>>>>> I can reconstruct it, but still)? I will try also in the other
> >>>>>>>>>>> environments and also with an environment with an ensemble,
> >>>>>>>>>>> but I would like to know beforehand what the issue could be.
> >>>>>>>>>>>
> >>>>>>>>>>> Not sure if it is relevant, but:
> >>>>>>>>>>> Activated Kerberos Authentication and Kerberos SSL for clients
> >>>>>>>>>>> and quorum.
> >>>>>>>>>>
> >>>>>>>>>> Quorum? In standalone mode there is no 'quorum' auth.
> >>>>>>>>>>
> >>>>>>>>>> Enrico
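
For anyone who wants to check whether a data directory is in the state discussed in this thread (transaction logs present, no snapshot file), here is a minimal sketch. It assumes the usual `snapshot.<zxid>` and `log.<zxid>` file names inside `version-2`, and that snapshots and logs live in the same directory, as they appear to in the log output above. The function names are mine, not part of ZooKeeper, and the check looks only at file names; it does not tell you whether an existing snapshot is valid.

```python
import os
import tempfile


def list_zk_files(version_dir):
    """Split the files in a ZooKeeper version-2 directory by name prefix."""
    names = sorted(os.listdir(version_dir))
    snapshots = [n for n in names if n.startswith("snapshot.")]
    txn_logs = [n for n in names if n.startswith("log.")]
    return snapshots, txn_logs


def logs_without_snapshot(version_dir):
    """True if txn logs exist but no snapshot file does (names only)."""
    snapshots, txn_logs = list_zk_files(version_dir)
    return bool(txn_logs) and not snapshots


if __name__ == "__main__":
    # Demo on a throwaway directory. To check a real node, pass its
    # <dataDir>/version-2 path instead (the thread uses /zookeeper/version-2).
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in ("log.b34", "log.b72"):
            open(os.path.join(demo_dir, name), "w").close()
        print(logs_without_snapshot(demo_dir))
```

If this reports logs without a snapshot on a 3.4.x data directory, that would be consistent with the "No snapshot found, but there are log entries" error after switching to 3.5.5, though it does not by itself confirm the cause.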