and apologies for the full email signature... Dave
On Tue, Jul 30, 2024 at 12:49 PM David Early <[email protected]> wrote:

> I am going to answer my own question.
>
> Yes, you can, at least for a default H2 database: we have not had a
> catastrophic error since doing the following.
>
> -----------------
> In our case, all the flows were still in flow_storage, nothing was lost,
> and the buckets were still defined in the H2 database.
>
> Before you begin, you will need to go into flow_storage and figure out
> what you have:
>
> # find ./8f0d9b79-b329-43d6-ad65-2a0db627492e -type f
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/6/6.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/3/3.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/9/9.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/10/10.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/5/5.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/4/4.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/8/8.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/2/2.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/7/7.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/1/1.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/11/11.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/3/3.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/5/5.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/4/4.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/2/2.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/1/1.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/afa05554-1d56-4ec8-b09c-cd1d0001dc70/3/3.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/afa05554-1d56-4ec8-b09c-cd1d0001dc70/2/2.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/afa05554-1d56-4ec8-b09c-cd1d0001dc70/1/1.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/5300c691-dbbf-4f7f-9f02-49c1c01f0188/2/2.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/5300c691-dbbf-4f7f-9f02-49c1c01f0188/1/1.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/88f6d53c-6066-4fd1-b2f9-9f97c9068124/1/1.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/2733df97-2973-430b-8be3-5a68b779021c/2/2.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/2733df97-2973-430b-8be3-5a68b779021c/1/1.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/0c0045dd-1904-423e-83b0-230543bfc556/1/1.snapshot
>
> The first UUID is the bucket, the second is the flow.
>
> To get the name of the flow from the file, use the following Python script
> (saved as /tmp/getname.py for the loop further down):
>
> import json
> import sys
> fn = sys.argv[1]
> with open(fn) as fi:
>     jdata = json.load(fi)
> print(fn,jdata['content']['flowSnapshot']['flowContents']['name'])
>
> And run:
>
> cd ./flow_storage
>
> for f in `find . -type f`
> do
>   python3 /tmp/getname.py ${f}
> done
>
> This will print each snapshot file together with its flow name.
>
> ---------
>
> Download the h2 jar binary:
>
> wget https://search.maven.org/remotecontent?filepath=com/h2database/h2/2.3.230/h2-2.3.230.jar
> mv 'remotecontent?filepath=com%2Fh2database%2Fh2%2F2.3.230%2Fh2-2.3.230.jar' h2.jar
>
> Access the H2 datastore:
>
> cd /data/nifi-registry # this is the directory that CONTAINS the database directory
>
> Get into the H2 shell:
>
> java -cp ~/h2.jar org.h2.tools.Shell
>
> You will be prompted for a URL.
> Unless you have changed the default (nifi.registry.db.url), it should be
> this:
>
> jdbc:h2:./database/nifi-registry-primary
>
> From the shell you can view the tables:
>
> SELECT * FROM INFORMATION_SCHEMA.TABLES;
>
> In our case, the buckets were all present and correct, so we did not have
> to deal with those, but the process is likely similar.
>
> You will be working with 3 tables:
>
> BUCKET_ITEM
> FLOW
> FLOW_SNAPSHOT
>
> (The order is important)
> Run the following for each snapshot to reinsert into the metadata database:
>
> SQL1:
> insert into bucket_item
> (ID,NAME,DESCRIPTION,CREATED,MODIFIED,ITEM_TYPE,BUCKET_ID) values
> ('0c0045dd-1904-423e-83b0-230543bfc556','<NAME from the script>',
> '<ADD an appropriate comment>','2024-07-01 00:00:00.000',
> '2024-07-01 00:00:00.000','FLOW','8f0d9b79-b329-43d6-ad65-2a0db627492e');
>
> In this case, the created and modified times are the same; you can change
> these as you see fit. At this point, we just needed the flows back in, and
> loss of the exact times was not relevant.
>
> ID - UUID of the flow
> NAME - Name of the flow, we extracted these from the flow snapshots using
> the script above
> DESCRIPTION - Description that will appear in registry
> CREATED - Time of creation
> MODIFIED - Time of last mod
> ITEM_TYPE - 'FLOW' (we are not aware of any other values)
> BUCKET_ID - UUID of the bucket
>
> SQL2:
> insert into flow (ID) values ('0c0045dd-1904-423e-83b0-230543bfc556');
>
> SQL3:
> insert into flow_snapshot (FLOW_ID,VERSION,CREATED,CREATED_BY,COMMENTS)
> values ('0c0045dd-1904-423e-83b0-230543bfc556',1,'2024-07-01 00:00:00.000',
> '<insert an ID for the person who modified this, e.g. an email>',
> '<Insert a comment: e.g.
> Restore version 1>');
>
> VERSION - Integer, should match the integer in the directory for that
> snapshot
>
> On Mon, Jul 29, 2024 at 5:56 PM David Early <[email protected]>
> wrote:
>
>> We have a registry (1.23.2) that WAS working but for some reason has lost
>> all the metadata in the DB. I was able to get into the H2 DB and see that
>> we do not have any flow information....it just isn't there.
>>
>> However the flows are in the flow_storage directory (flow persistence is
>> the directory, not git).
>>
>> We can find no evidence that the process was restarted or otherwise
>> manipulated, just that the data is now gone.
>>
>> This is a bit of a problem because this is used to bridge configs from
>> one system to another, so I have 2 NiFi instances that reference the same
>> config IDs.
>>
>> Is there a way to restore the flows from flow_storage that would preserve
>> the IDs/versions and allow the existing NiFi systems to see the flows?
>>
>> This was ironically noticed today when we went in to configure a regular
>> backup of the registry DB and flows (we are new to the registry).
>>
>> Dave
>
> --
> David Early, Ph.D.
> [email protected]
> 720-470-7460 Cell
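
For anyone repeating the procedure quoted above on a larger registry, typing the three inserts by hand for every snapshot gets tedious. Below is a minimal sketch that walks flow_storage and prints the statements instead. It is not part of the original recovery and has not been run against a real registry. Assumptions: the layout is <bucket UUID>/<flow UUID>/<version>/<version>.snapshot as in the find output; BUCKET_ITEM and FLOW get one row per flow while FLOW_SNAPSHOT gets one row per version directory; the timestamp, description, and comment text are placeholders; the function names are made up for illustration.

```python
import json
import sys
from pathlib import Path

# Placeholder timestamp, the same one used in the hand-written inserts.
PLACEHOLDER_TS = "2024-07-01 00:00:00.000"


def sql_quote(value):
    """Return value as a SQL string literal, doubling embedded single quotes."""
    return "'" + str(value).replace("'", "''") + "'"


def collect_flows(flow_storage):
    """Map (bucket_id, flow_id) -> {version: snapshot path} from the directory tree."""
    root = Path(flow_storage)
    flows = {}
    for path in root.glob("*/*/*/*.snapshot"):
        bucket_id, flow_id, version = path.relative_to(root).parts[:3]
        if version.isdigit():
            flows.setdefault((bucket_id, flow_id), {})[int(version)] = path
    return flows


def flow_name(snapshot_path):
    """Read the flow name from a snapshot file (same JSON path as getname.py)."""
    with open(snapshot_path) as fi:
        jdata = json.load(fi)
    return jdata["content"]["flowSnapshot"]["flowContents"]["name"]


def build_statements(flow_storage, created_by, timestamp=PLACEHOLDER_TS):
    """Build the inserts in table order: bucket_item, flow, then flow_snapshot."""
    statements = []
    ts = sql_quote(timestamp)
    description = sql_quote("Restored from flow_storage")  # placeholder text
    for (bucket_id, flow_id), versions in sorted(collect_flows(flow_storage).items()):
        # Assumption: take the flow name from the highest-numbered snapshot.
        name = sql_quote(flow_name(versions[max(versions)]))
        fid = sql_quote(flow_id)
        statements.append(
            "insert into bucket_item "
            "(ID,NAME,DESCRIPTION,CREATED,MODIFIED,ITEM_TYPE,BUCKET_ID) values "
            f"({fid},{name},{description},{ts},{ts},'FLOW',{sql_quote(bucket_id)});"
        )
        statements.append(f"insert into flow (ID) values ({fid});")
        for version in sorted(versions):
            comment = sql_quote(f"Restore version {version}")
            statements.append(
                "insert into flow_snapshot "
                "(FLOW_ID,VERSION,CREATED,CREATED_BY,COMMENTS) values "
                f"({fid},{version},{ts},{sql_quote(created_by)},{comment});"
            )
    return statements


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 <this script> <flow_storage dir> [created_by]
    who = sys.argv[2] if len(sys.argv) > 2 else "restore"
    print("\n".join(build_statements(sys.argv[1], who)))
```

Run it with the flow_storage directory as the first argument and, optionally, an ID for CREATED_BY as the second. Review the output before pasting it into the H2 shell, and take a copy of the database directory first.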
