and apologies for the full email signature...

Dave

On Tue, Jul 30, 2024 at 12:49 PM David Early <[email protected]>
wrote:

> I am going to answer my own question.
>
> Yes, you can, at least for a default H2 database. We have not had a
> catastrophic error since doing the following.
>
> -----------------
> In our case, all the flows were still in flow_storage with nothing lost,
> and the buckets were still defined in the H2 database.
>
> Before you begin, you will need to go into the flow_storage directory and
> figure out what you have:
>
> # find ./8f0d9b79-b329-43d6-ad65-2a0db627492e -type f
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/6/6.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/3/3.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/9/9.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/10/10.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/5/5.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/4/4.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/8/8.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/2/2.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/7/7.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/1/1.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/622e1509-3a65-420d-95be-4b8214e65f3c/11/11.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/3/3.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/5/5.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/4/4.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/2/2.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/9e2fce9a-26c4-4b1c-8139-8ae0f828313f/1/1.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/afa05554-1d56-4ec8-b09c-cd1d0001dc70/3/3.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/afa05554-1d56-4ec8-b09c-cd1d0001dc70/2/2.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/afa05554-1d56-4ec8-b09c-cd1d0001dc70/1/1.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/5300c691-dbbf-4f7f-9f02-49c1c01f0188/2/2.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/5300c691-dbbf-4f7f-9f02-49c1c01f0188/1/1.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/88f6d53c-6066-4fd1-b2f9-9f97c9068124/1/1.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/2733df97-2973-430b-8be3-5a68b779021c/2/2.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/2733df97-2973-430b-8be3-5a68b779021c/1/1.snapshot
> ./8f0d9b79-b329-43d6-ad65-2a0db627492e/0c0045dd-1904-423e-83b0-230543bfc556/1/1.snapshot
>
> The first UUID is the bucket, the second is the flow.
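
For anyone scripting this step, here is a minimal sketch of splitting one of
those paths into bucket, flow and version. The names SnapshotRef and
parse_snapshot_path are made up for illustration, and the sketch assumes the
layout is exactly <bucket>/<flow>/<version>/<version>.snapshot as in the
listing above.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class SnapshotRef(NamedTuple):
    """Identifiers encoded in a flow_storage snapshot path (hypothetical helper)."""
    bucket_id: str
    flow_id: str
    version: int


def parse_snapshot_path(path: str) -> SnapshotRef:
    """Split <bucket>/<flow>/<version>/<version>.snapshot into its parts."""
    parts = PurePosixPath(path).parts
    # The last four components are: bucket UUID, flow UUID, version dir, file name.
    bucket_id, flow_id, version_dir, file_name = parts[-4:]
    if file_name != f"{version_dir}.snapshot":
        raise ValueError(f"unexpected snapshot file name: {path}")
    return SnapshotRef(bucket_id, flow_id, int(version_dir))


if __name__ == "__main__":
    example = ("./8f0d9b79-b329-43d6-ad65-2a0db627492e/"
               "0c0045dd-1904-423e-83b0-230543bfc556/1/1.snapshot")
    print(parse_snapshot_path(example))
```

The version comes back as an integer so it can later be used directly as the
VERSION value.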
>
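> To get the name of the flow from a snapshot file, save the following Python
> script as /tmp/getname.py:
>
> import json
> import sys
>
> # Path to a .snapshot file, passed as the first argument
> fn = sys.argv[1]
> with open(fn) as fi:
>     jdata = json.load(fi)
> # The flow name is stored under the snapshot's flowContents
> print(fn, jdata['content']['flowSnapshot']['flowContents']['name'])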
>
> And run:
>
> cd ./flow_storage
>
> for f in $(find . -type f -name '*.snapshot')
> do
>   python3 /tmp/getname.py "${f}"
> done
>
> This will print each snapshot path followed by its flow name.
>
> ---------
>
> Download the H2 jar:
>
> wget -O ~/h2.jar 'https://search.maven.org/remotecontent?filepath=com/h2database/h2/2.3.230/h2-2.3.230.jar'
>
> Access the H2 datastore:
>
> # this is the directory that CONTAINS the database directory
> cd /data/nifi-registry
>
> Get into the H2 shell:
>
> java -cp ~/h2.jar org.h2.tools.Shell
>
> You will be prompted for a URL.  Unless you have changed the default
> (nifi.registry.db.url), it should be this:
>
> jdbc:h2:./database/nifi-registry-primary
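
If you are not sure whether the default was changed, a rough sketch like the
following can pull the value out of the registry's properties file. The
function name read_property is made up, the location of the properties file is
an assumption you will need to adjust, and the parser only handles simple
key=value lines (no line continuations or ":" separators).

```python
import sys


def read_property(properties_text, key):
    """Return the value of `key` from simple key=value properties text, or None."""
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        # Split on the first "=" only, so "=" inside the value is preserved.
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 dburl.py /path/to/nifi-registry.properties
    with open(sys.argv[1]) as fh:
        print(read_property(fh.read(), "nifi.registry.db.url"))
```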
>
> From the shell you can view the tables:
>
> SELECT * FROM INFORMATION_SCHEMA.TABLES;
>
>
> In our case, the buckets were all present and correct, so we did not have
> to deal with those but the process is likely similar.
>
>
> You will be working with 3 tables:
>
> BUCKET_ITEM
> FLOW
> FLOW_SNAPSHOT
>
>
> The order is important: the BUCKET_ITEM row must go in before the FLOW row,
> and the FLOW row before any FLOW_SNAPSHOT rows.
> Run SQL1 and SQL2 once for each flow, then SQL3 once for each snapshot of
> that flow, to reinsert the metadata into the database:
>
> SQL1:
> insert into bucket_item
>   (ID,NAME,DESCRIPTION,CREATED,MODIFIED,ITEM_TYPE,BUCKET_ID)
> values
>   ('0c0045dd-1904-423e-83b0-230543bfc556',
>    '<NAME from the script>',
>    '<ADD an appropriate comment>',
>    '2024-07-01 00:00:00.000',
>    '2024-07-01 00:00:00.000',
>    'FLOW',
>    '8f0d9b79-b329-43d6-ad65-2a0db627492e');
>
> In this case, the created and modified times are the same; you can change
> them as you see fit.  At this point we just needed the flows back in, and
> losing the exact times did not matter.
>
> ID - UUID of the flow
> NAME - Name of the flow, we extracted these from the flow snapshots using
> the script above
> DESCRIPTION - Description that will appear in registry
> CREATED - Time of creation
> MODIFIED - Time of last mod
> ITEM_TYPE - 'FLOW' (we are not aware of any other values)
> BUCKET_ID - UUID of the bucket
>
>
> SQL2:
> insert into flow (ID) values ('0c0045dd-1904-423e-83b0-230543bfc556');
>
> SQL3:
> insert into flow_snapshot (FLOW_ID,VERSION,CREATED,CREATED_BY,COMMENTS)
> values
>   ('0c0045dd-1904-423e-83b0-230543bfc556',
>    1,
>    '2024-07-01 00:00:00.000',
>    '<insert an ID for the person who modified this, e.g. an email>',
>    '<Insert a comment: e.g. Restore version 1>');
>
> VERSION - Integer, should match the integer in the directory for that
> snapshot
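
With more than a handful of flows, typing these by hand gets tedious. Below is
a rough sketch that walks flow_storage and prints the three kinds of INSERT
for every flow and snapshot, in the order given above. It is not something we
ran as part of the recovery. The timestamp, the description text, the
CREATED_BY value and the choice to take the flow name from the newest snapshot
are all assumptions to adjust, and the output should be reviewed before it is
pasted into the H2 shell.

```python
import json
import sys
from pathlib import Path
from typing import List

STAMP = "2024-07-01 00:00:00.000"           # placeholder timestamp, as above
DESCRIPTION = "Restored from flow_storage"  # assumed description text
RESTORED_BY = "restore-script"              # assumed CREATED_BY value


def q(value: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def restore_sql(flow_storage: Path) -> List[str]:
    """Build INSERT statements for every flow and snapshot under flow_storage."""
    statements = []
    # Layout assumed: <bucket>/<flow>/<version>/<version>.snapshot
    for flow_dir in sorted(p for p in flow_storage.glob("*/*") if p.is_dir()):
        bucket_id, flow_id = flow_dir.parent.name, flow_dir.name
        versions = sorted(int(p.parent.name) for p in flow_dir.glob("*/*.snapshot"))
        if not versions:
            continue
        # Take the flow name from the newest snapshot (an assumption).
        latest = flow_dir / str(versions[-1]) / f"{versions[-1]}.snapshot"
        with open(latest) as fh:
            name = json.load(fh)["content"]["flowSnapshot"]["flowContents"]["name"]
        # One bucket_item row and one flow row per flow, in that order.
        statements.append(
            "insert into bucket_item "
            "(ID,NAME,DESCRIPTION,CREATED,MODIFIED,ITEM_TYPE,BUCKET_ID) "
            f"values ({q(flow_id)},{q(name)},{q(DESCRIPTION)},"
            f"{q(STAMP)},{q(STAMP)},'FLOW',{q(bucket_id)});")
        statements.append(f"insert into flow (ID) values ({q(flow_id)});")
        # One flow_snapshot row per version directory.
        for version in versions:
            comment = f"Restore version {version}"
            statements.append(
                "insert into flow_snapshot "
                "(FLOW_ID,VERSION,CREATED,CREATED_BY,COMMENTS) "
                f"values ({q(flow_id)},{version},{q(STAMP)},"
                f"{q(RESTORED_BY)},{q(comment)});")
    return statements


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 restore_sql.py ./flow_storage > restore.sql
    print("\n".join(restore_sql(Path(sys.argv[1]))))
```

Flow names containing a single quote are escaped by doubling it, which is the
standard SQL way of embedding a quote in a string literal.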
>
>
>
>
> On Mon, Jul 29, 2024 at 5:56 PM David Early <[email protected]>
> wrote:
>
>> We have a registry (1.23.2) that WAS working but for some reason has lost
>> all the metadata in the DB.  I was able to get into the H2 DB and see that
>> we do not have any flow information. It just isn't there.
>>
>> However the flows are in the flow_storage directory (flow persistence is
>> the directory, not git).
>>
>> We can find no evidence that the process was restarted or otherwise
>> manipulated, just that the data is now gone.
>>
>> This is a bit of a problem because this is used to bridge configs from
>> one system to another, so I have 2 NiFi instances that reference the same
>> config IDs.
>>
>> Is there a way to restore the flows from flow_storage that would preserve
>> the IDs/versions and allow the existing NiFi systems to see the flows?
>>
>> This was ironically noticed today when we went in to configure a regular
>> backup of the registry DB and flows (we are new to the registry).
>>
>> Dave
>>
>
>
> --
> David Early, Ph.D.
> [email protected]
> 720-470-7460 Cell
>
>

