I am going to answer my own question.

Yes, you can, at least for a default H2 database. We have not had a
catastrophic error since doing the following.

-----------------
In our case, all the flows were still in flow_storage, nothing was lost
there, and the buckets were still defined in the H2 database.

Before you begin, you will need to go into the flow_storage and figure out
what you have:

# find ./8f0d9b79-b329-43d6-ad65-2a0db627492e -type f
./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/6/6.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/3/3.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/9/9.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/10/10.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/5/5.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/4/4.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/8/8.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/2/2.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/7/7.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/1/1.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/11/11.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/3/3.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/5/5.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/4/4.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/2/2.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/1/1.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/afa05554-1d56-4ec8-b09c-cd1d0001dc70/3/3.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/afa05554-1d56-4ec8-b09c-cd1d0001dc70/2/2.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/afa05554-1d56-4ec8-b09c-cd1d0001dc70/1/1.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/5300c691-dbbf-4f7f-9f02-49c1c01f0188/2/2.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/5300c691-dbbf-4f7f-9f02-49c1c01f0188/1/1.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/88f6d53c-6066-4fd1-b2f9-9f97c9068124/1/1.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/2733df97-2973-430b-8be3-5a68b779021c/2/2.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/2733df97-2973-430b-8be3-5a68b779021c/1/1.snapshot
./8f0d9b79-b329-43d6-ad65-2a0db627492e/0c0045dd-1904-423e-83b0-230543bfc556/1/1.snapshot

The first UUID is the bucket, the second is the flow.
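
As a rough sketch of that layout, assuming every snapshot path has the
form <bucket>/<flow>/<version>/<version>.snapshot as in the listing above,
a path can be split into its parts like this. The helper name
parse_snapshot_path is made up for illustration and is not part of NiFi
Registry:

```python
from pathlib import PurePosixPath

def parse_snapshot_path(path):
    """Split a flow_storage path into (bucket_id, flow_id, version)."""
    parts = PurePosixPath(path).parts
    # Assumed layout: .../<bucket>/<flow>/<version>/<version>.snapshot
    bucket_id, flow_id, version_dir, filename = parts[-4:]
    if filename != f"{version_dir}.snapshot":
        raise ValueError(f"unexpected snapshot path: {path}")
    return bucket_id, flow_id, int(version_dir)

example = ("./8f0d9b79-b329-43d6-ad65-2a0db627492e/"
           "0c0045dd-1904-423e-83b0-230543bfc556/1/1.snapshot")
print(parse_snapshot_path(example))
```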

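To get the name of the flow from the file, save the following Python
script as /tmp/getname.py:

import json
import sys

# Path to a single .snapshot file, passed on the command line
fn = sys.argv[1]
with open(fn) as fi:
    jdata = json.load(fi)
# Print the file path followed by the flow name stored in the snapshot
print(fn, jdata['content']['flowSnapshot']['flowContents']['name'])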

And run:

cd ./flow_storage

for f in $(find . -type f -name '*.snapshot')
do
  python3 /tmp/getname.py "${f}"
done

This will print each snapshot path followed by its flow name.
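
If it is more convenient to have one line per flow rather than one per
snapshot, the same information can be collected in a single pass. This is
only a sketch: it assumes the directory layout shown earlier and the same
JSON keys as the script above, and the function name inventory is
hypothetical. Taking the name from the highest version is an arbitrary
choice.

```python
import json
import os

def inventory(root):
    """Map (bucket_id, flow_id) -> {"name": str, "versions": [int, ...]}."""
    flows = {}
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            if not fname.endswith(".snapshot"):
                continue
            full = os.path.join(dirpath, fname)
            # Assumed layout below root: <bucket>/<flow>/<version>/<version>.snapshot
            parts = os.path.relpath(full, root).split(os.sep)
            if len(parts) != 4:
                continue
            bucket_id, flow_id, version = parts[0], parts[1], int(parts[2])
            with open(full) as fi:
                jdata = json.load(fi)
            name = jdata["content"]["flowSnapshot"]["flowContents"]["name"]
            entry = flows.setdefault((bucket_id, flow_id),
                                     {"name": name, "versions": []})
            # Keep the name from the highest version seen so far
            if not entry["versions"] or version > max(entry["versions"]):
                entry["name"] = name
            entry["versions"].append(version)
    for entry in flows.values():
        entry["versions"].sort()
    return flows
```

Calling inventory("./flow_storage") and printing the items gives the
bucket UUID, flow UUID, flow name and list of versions for each flow.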

---------

Download the h2 jar binary:

wget 'https://search.maven.org/remotecontent?filepath=com/h2database/h2/2.3.230/h2-2.3.230.jar'
mv 'remotecontent?filepath=com%2Fh2database%2Fh2%2F2.3.230%2Fh2-2.3.230.jar' ~/h2.jar

Access the H2 datastore:

cd /data/nifi-registry # this is the directory that CONTAINS the database directory

Get into the H2 shell:

java -cp ~/h2.jar org.h2.tools.Shell

You will be prompted for a URL.  Unless you have changed the default
(nifi.registry.db.url), it should be this:

jdbc:h2:./database/nifi-registry-primary
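
If you are not sure whether the default was changed, the value can be read
out of the registry's properties file. The sketch below only parses
properties text for one key. The sample text is a made-up excerpt, and the
helper name read_property is hypothetical. Point it at the contents of
your own properties file.

```python
def read_property(properties_text, key):
    """Return the value of `key` from Java-style properties text, or None."""
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments
        if not line or line.startswith(("#", "!")):
            continue
        # Split on the first '=' only, since the value may contain '=' itself
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None

# Made-up excerpt for illustration only
sample = """
# database settings
nifi.registry.db.url=jdbc:h2:./database/nifi-registry-primary
"""
print(read_property(sample, "nifi.registry.db.url"))
```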

From the shell you can view the tables:

SELECT * FROM INFORMATION_SCHEMA.TABLES;


In our case, the buckets were all present and correct, so we did not have
to deal with those but the process is likely similar.


You will be working with 3 tables:

BUCKET_ITEM
FLOW
FLOW_SNAPSHOT


(The order is important: run the statements in the order shown.)
Run the following to reinsert into the metadata database. SQL1 and SQL2
are run once per flow, SQL3 once per snapshot (version) of that flow:

SQL1:
insert into bucket_item
(ID,NAME,DESCRIPTION,CREATED,MODIFIED,ITEM_TYPE,BUCKET_ID) values
('0c0045dd-1904-423e-83b0-230543bfc556','<NAME from the script>',
'<ADD an appropriate comment>','2024-07-01 00:00:00.000',
'2024-07-01 00:00:00.000','FLOW','8f0d9b79-b329-43d6-ad65-2a0db627492e');

In this case, the created and modified times are the same; you can change
them as you see fit.  At this point we just needed the flows back in, so
losing the exact times did not matter.

ID - UUID of the flow
NAME - Name of the flow (we extracted these from the flow snapshots using
the script above)
DESCRIPTION - Description that will appear in registry
CREATED - Time of creation
MODIFIED - Time of last modification
ITEM_TYPE - 'FLOW' (we are not aware of any other values)
BUCKET_ID - UUID of the bucket


SQL2:
insert into flow (ID) values ('0c0045dd-1904-423e-83b0-230543bfc556');

SQL3:
insert into flow_snapshot (FLOW_ID,VERSION,CREATED,CREATED_BY,COMMENTS)
values ('0c0045dd-1904-423e-83b0-230543bfc556',1,'2024-07-01 00:00:00.000',
'<insert an ID for the person who modified this, e.g. an email>',
'<Insert a comment: e.g. Restore version 1>');

VERSION - Integer, should match the integer in the directory for that
snapshot
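
With more than a handful of flows, typing the three inserts above by hand
gets tedious. The following is a minimal sketch that generates them for
one flow, assuming you already know the bucket UUID, flow UUID, flow name
and version numbers. The function names, the default timestamp, the
description and the created_by value are all placeholders of ours, not
anything NiFi Registry requires. Review the output before pasting it into
the H2 shell.

```python
def sql_quote(value):
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + str(value).replace("'", "''") + "'"

def restore_statements(bucket_id, flow_id, name, versions,
                       timestamp="2024-07-01 00:00:00.000",
                       created_by="restore@example.com",
                       description="Restored from flow_storage"):
    """Return the INSERT statements for one flow, in the required order."""
    ts = sql_quote(timestamp)
    stmts = [
        # One bucket_item row and one flow row per flow
        "insert into bucket_item "
        "(ID,NAME,DESCRIPTION,CREATED,MODIFIED,ITEM_TYPE,BUCKET_ID) values "
        f"({sql_quote(flow_id)},{sql_quote(name)},{sql_quote(description)},"
        f"{ts},{ts},'FLOW',{sql_quote(bucket_id)});",
        f"insert into flow (ID) values ({sql_quote(flow_id)});",
    ]
    # One flow_snapshot row per version directory found on disk
    for version in sorted(versions):
        comment = sql_quote(f"Restore version {version}")
        stmts.append(
            "insert into flow_snapshot "
            "(FLOW_ID,VERSION,CREATED,CREATED_BY,COMMENTS) values "
            f"({sql_quote(flow_id)},{int(version)},{ts},"
            f"{sql_quote(created_by)},{comment});"
        )
    return stmts

for stmt in restore_statements(
        "8f0d9b79-b329-43d6-ad65-2a0db627492e",
        "0c0045dd-1904-423e-83b0-230543bfc556",
        "<NAME from the script>", [1]):
    print(stmt)
```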




On Mon, Jul 29, 2024 at 5:56 PM David Early <david.ea...@grokstream.com>
wrote:

> We have a registry (1.23.2) that WAS working but for some reason has lost
> all the metadata in the DB.  I was able to get into the H2 DB and see that
> we do not have any flow information....it just isn't there.
>
> However the flows are in the flow_storage directory (flow persistence is
> the directory, not git).
>
> We can find no evidence that the process was restarted or otherwise
> manipulated, just that the data is now gone.
>
> This is a bit of a problem because this is used to bridge configs from one
> system to another, so I have 2 NiFi instances that reference the same
> config IDs.
>
> Is there a way to restore the flows from flow_storage that would preserve
> the IDs/versions and allow the existing NiFi systems to see the flows?
>
> This was ironically noticed today when we went in to configure a regular
> backup of the registry DB and flows (we are new to the registry).
>
> Dave
>


-- 
David Early, Ph.D.
david.ea...@grokstream.com
720-470-7460 Cell
