Colleagues

I want to look at the bigger picture here. I apologise again for
another long email. There are many issues here that this community has
ignored for too long. So I hope some of you will at least read through
to the end, think about what I say and comment...maybe even support
the general idea...

Although this has been a discussion with only a handful of people it
has raised some interesting points. Many followers may have missed the
significance of some of these points or perhaps not thought deeply
about them. These include (in no particular order):
-Different registration requirements for IPv4 and IPv6
-Differences in the way IPv4 and IPv6 have been allocated and assigned over time
-Block size (fixed or random)
-Retrofitting of features
-Different levels of adherence to policy by resource holders
-Voluntary nature of supplying some details
-No consistent approach to supplied data
-Confusion for some resource holders about what data to publish
-Effort required to maintain data in the RIPE Database
-Volatility of some fast changing data
-Privacy
-Customer confidentiality
-Public interest
-Public registry
-Registering public networks
-Addresses defined as free text (sometimes including name)

This is a lot of issues wrapped around one policy proposal. This
proposal will not address all, or even most, of these issues. I don't
believe this is the right way forward. But what is the root problem
here and how can we address it?

There are also some other points to consider. At recent RIPE Meetings
some prominent members of this community have told me in the strongest
possible terms that there is no way in hell that they are going to
list any of their customers' details in the public RIPE Database, no
matter what any policy says. Commercial confidentiality seems to be a
very sensitive issue for some resource holders. Of course this is a
valid concern. But it needs to be balanced. Policy needs consensus,
but when we have a consensus all resource holders must follow it. That
is the only way a self regulating industry can work.

Another area of concern is the alignment of handling both IPv4 and
IPv6 registrations in the RIPE Database. Where we have two systems
that are managed in different ways, there are of course two ways they
can be aligned. We can dumb down the IPv4 data to the level of IPv6.
Or we can raise the IPv6 data to the level of IPv4. Everyone is
focused on the dumbing down option. No one has even considered moving
in the other direction. I have never understood why the IPv6
registration policy was not written with the same requirements in mind
as the IPv4 policy in the first instance. Maybe the automation options
available at the time were not as extensive as they are today.
Computer power and bandwidth were certainly not comparable to what
they are today. Changes to the RIPE Database data model, interfaces,
technology and design would make it possible to raise the level of
IPv6 information available in the public registry to the same level as
IPv4.

At the heart of this issue is a public registry. But what is that in
2023? What does it mean? What should be in it? Who is it for? How do
we achieve a three way balance between commercial sensitivity, public
need and privacy? These are the sort of questions I was hoping the
RIPE Database Requirements Task Force would answer when they started
their work. The end result was a little disappointing. They didn't
answer any of these questions. They focussed most of their attention
on looking backwards. Many of us know the history. We want to know how to
move forwards. These types of proposals are not the right way forward.
So where should we be heading? I believe we need a new Task Force to
do what I thought the last one would do. To determine the business
requirements for the RIPE Database as a public registry in the 2020s
and beyond. To answer these fundamental questions. To establish the
registration requirements for a public registry that we can have a
consensus on and everyone will accept and apply.

Daniel said at the BoF in Iceland, "It's time to stop tinkering
around the edges of the RIPE Database". But that is exactly what these
policy proposals are doing. Here we are trying to retrofit an IPv6
construct onto IPv4. Straight away assignment-size had to be dropped
as it won't fit with the way IPv4 assignments are made or how they
could be retrospectively aggregated. Knowing the block size has nothing
to do with HD ratios and further allocations. It tells you nothing
about how many assignments have been made from the aggregate, 1 or
100. It exists for IPv6 for other reasons. Those reasons also apply to
IPv4, but we can't satisfy them there because the two systems are not
the same.

We need to start with a full, forward looking Business Requirements
document for the RIPE Database, based on accepted business analysis
procedures. We can follow that with a Technical Requirements document
outlining how things should be done. Not at the level of defining
technology or software design; that is for the NCC engineers to
determine. This should include the outline design of the data model
and interfaces to commercial IPAM systems. Syncing bits of your
internal data, as defined necessary for a public registry, with a
database really isn't the problem in 2023. There should be no labour
intensive work here. It doesn't matter if the RIPE Database has 5m or
50m or 500m assignment data sets in it. As long as they contain the
data defined by the requirements to serve as a balanced public
registry. No one should be manually entering this data. No one is
going to read this data. We can build tools to provide information
from this data in a human understandable format. In terms of
registration requirements there should be little or no distinction
between IPv4 and IPv6. But that doesn't mean we take the lowest common
level.

In case anyone is in any doubt, I am suggesting a redesign and rebuild
of the RIPE Database, based on an updated understanding of what is
needed to maintain and operate a public registry for all stakeholders.
I know that neither the RIPE community, nor the RIPE WG chairs, nor the
RIPE NCC membership (who pay for it), nor the RIPE NCC executive board
or senior management has any appetite for this. In the past whenever I
have brought up this subject I have been totally ignored. Replying to
emails where I have mentioned this, people have noticeably answered
other points and cut out any reference to redesigning the RIPE
Database. Many people have gone to extraordinary lengths to avoid even
having this conversation. Seriously guys, the time has come to have
this conversation. Daniel tried to start it at that BoF. The RIPE
community has just let it drop...again.

The current design of the RIPE Database data model and software is
about 25 years old. It was a big waterfall project with a big bang
release and switch over from version 2 in April 2001. Aspects of the
design, including having all data stored in untouched, human readable,
text blocks, even predate this. We have had two major rewrites of the
software in this time, first in C and then in Java. But the underlying
design was not changed at all. Much of it is no longer fit for purpose.
This attempt to retrofit aggregations from IPv6 to IPv4 highlights some
of the cracks. It gets harder and harder to make significant changes to
this system over time. Assigning a whole allocation, for example, cuts
to the core of the software design and data model. Just to make this one
change would be a very disruptive process for all users. Even if we
decide today to set up a new task force to determine the business
requirements, then the technical requirements, then redesign and
rebuild in small agile chunks, we won't have a new system for at least
5 years. By then we will be working with a 30 year old data model and
system design. That is the age of dinosaurs in the IT world. Do we
really want to wait until it breaks before we do anything? Calm,
collective consideration is a better working model than panic,
reactive mode. We are long overdue for this.

It does not need to be done again in one huge step. It can be done
incrementally. Use agile not waterfall methods. The whole system can
be easily broken down into subsystems which can be worked on
independently and deployed without massive disruption. I'll give some
of my own thoughts and ideas on how some of this can be done.

Task Force 1 to determine the business requirements of the RIPE
Database as a public registry.

Task Force 2 to determine the technical requirements of the RIPE
Database as a public registry.

Redesigned data model dropping the old fashioned requirement to have
all data stored in untouched text blocks and be human readable. Stored
data should be machine parsable and processable. Tools and interfaces
can be provided to offer information based on the stored data or raw
data for further machine processing.

Accommodate new business models including the acceptance of investors
and commercial RIRs operating below the RIPE NCC.

Interfaces to commercial IPAM systems so all the required data can be
uploaded and synced without human effort.

Expand the LIR Portal to a system of user accounts for anyone who
enters data into the database and identified/verified power users who
consume the data.

Notifications are basically an audit trail of changes to your data.
This should be configured through the user accounts. No need for it to
be spread throughout the entire database at the data set level. There
are millions of attributes with duplicated email addresses all over
the data. This has no public interest value at all and should not be
public data.

We should design a new authorisation and authentication scheme, also
configured through the user accounts. Again details about the security
of your data have no public interest value and should not be public
data. I don't know of any other web based system that publishes so
much information about how you secure and protect your data.

The basic data is composed of hierarchical sets of IP addresses. But
only abuse contacts use inheritance. All contact and management data
should be inherited. That again could remove millions of items of
duplicated, redundant data. Structure of contact and identification
data should also be redesigned with privacy and confidentiality in
mind.
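
To make the inheritance idea concrete, here is a minimal sketch in
Python. It is only an illustration, not a proposal for the actual data
model: the Block class, the admin_contact field and the
effective_contact function are all hypothetical names, and the
addresses are documentation prefixes. A block that sets no contact
picks up the one from the closest covering block that does.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    """Hypothetical registry entry: an address range with an optional contact."""
    prefix: str
    admin_contact: Optional[str] = None  # None means "inherit from a covering block"


def effective_contact(target: str, registry: List[Block]) -> Optional[str]:
    """Return the contact of the most specific covering block that sets one."""
    net = ipaddress.ip_network(target)
    candidates = []
    for block in registry:
        block_net = ipaddress.ip_network(block.prefix)
        # Compare only within the same address family; subnet_of() raises
        # TypeError when IPv4 and IPv6 networks are mixed.
        if block_net.version != net.version:
            continue
        if net.subnet_of(block_net) and block.admin_contact is not None:
            candidates.append((block_net.prefixlen, block.admin_contact))
    if not candidates:
        return None
    # The longest prefix is the closest ancestor (or the block itself).
    return max(candidates)[1]


registry = [
    Block("192.0.2.0/24", "ORG-EXAMPLE-NOC"),  # allocation sets the contact once
    Block("192.0.2.0/26"),                     # this assignment inherits it
    Block("192.0.2.64/26", "CUSTOMER-B-NOC"),  # this assignment overrides it
]
print(effective_contact("192.0.2.0/28", registry))   # ORG-EXAMPLE-NOC
print(effective_contact("192.0.2.64/28", registry))  # CUSTOMER-B-NOC
```

With a rule like this the contact is stored once at the allocation and
only repeated lower down where it actually differs.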

Resource holder and End User name and address details should be
properly formatted rather than free text.
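
As a rough sketch of what "properly formatted" could mean, assuming a
simple fixed set of fields (the PostalAddress class and its field names
are hypothetical, and real postal formats vary more than this allows
for), the stored data is structured and the familiar text form is
generated from it on demand:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostalAddress:
    """Hypothetical structured address replacing free-text address lines."""
    organisation: str
    street: str
    city: str
    postal_code: str
    country_code: str  # assumed to be a two-letter country code, e.g. "NL"

    def __post_init__(self) -> None:
        # Reject values that free text would silently accept.
        if len(self.country_code) != 2 or not self.country_code.isalpha():
            raise ValueError("country_code must be a two-letter code")

    def as_text(self) -> str:
        """Render the human readable form from the structured fields."""
        return "\n".join([
            self.organisation,
            self.street,
            f"{self.postal_code} {self.city}",
            self.country_code.upper(),
        ])


address = PostalAddress(
    organisation="Example Networks B.V.",
    street="Examplestraat 1",
    city="Amsterdam",
    postal_code="1000 AA",
    country_code="NL",
)
print(address.as_text())
```

Because the name and the address are separate fields, a tool can show
or withhold each one independently, which is not possible when both
are mixed into free text.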

Requirements for user registration details in a public registry could
be re-evaluated and re-designed from the bottom up with a three way
balance of privacy, confidentiality and public interest in mind.

Language and characterisation of data can be re-evaluated for the
whole data set.

Routing data could be better structured with usage in mind. Tools
could be built in to provide the structured data needed by those who
use this data.

Geolocation data could be built in rather than relying on external files.

Basic, anonymous queries could be limited to bare bones data with no
PII. More detailed data could be provided only to verified query
users, with accounts, with different levels of detail.
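
A minimal sketch of how such tiered query results could work, assuming
three access levels and a per-field visibility table. The level names,
the field names and which field sits in which tier are all assumptions
made for illustration; deciding that is exactly the job of the
requirements work.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """Hypothetical query tiers, ordered from least to most privileged."""
    ANONYMOUS = 0
    VERIFIED = 1
    INVESTIGATOR = 2


# Hypothetical table: the minimum level needed to see each field.
FIELD_VISIBILITY = {
    "prefix": AccessLevel.ANONYMOUS,
    "organisation": AccessLevel.ANONYMOUS,
    "tech_contact_name": AccessLevel.VERIFIED,
    "tech_contact_email": AccessLevel.VERIFIED,
    "end_user_address": AccessLevel.INVESTIGATOR,
}


def filter_record(record: dict, level: AccessLevel) -> dict:
    """Return only the fields visible at `level`.

    Fields missing from FIELD_VISIBILITY are withheld, so a newly added
    field is private until someone explicitly classifies it.
    """
    return {
        name: value
        for name, value in record.items()
        if name in FIELD_VISIBILITY and FIELD_VISIBILITY[name] <= level
    }


record = {
    "prefix": "2001:db8::/32",
    "organisation": "ORG-EX1-EXAMPLE",
    "tech_contact_name": "Example Person",
    "tech_contact_email": "noc@example.net",
    "end_user_address": "Examplestraat 1, Amsterdam",
}
print(filter_record(record, AccessLevel.ANONYMOUS))
```

The anonymous query above returns only the prefix and the organisation.
The same stored record serves every tier; only the view changes.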

Historical data could be subject to a one time post processing to
remove PII from public view but still allow anonymised cross
referencing that researchers and investigators can do now with the PII
data.
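
One possible way to do this, and it is my assumption rather than an
agreed method, is to replace each PII value with a keyed hash. The same
person or email address then always maps to the same opaque token, so
historical records can still be cross referenced, but the token does
not reveal the original value to anyone without the key. The
pseudonymise function and the "anon-" token format below are
hypothetical.

```python
import hashlib
import hmac


def pseudonymise(value: str, secret_key: bytes) -> str:
    """Map a PII value to a stable, opaque token using HMAC-SHA256.

    The value is normalised first so that trivial differences in case
    or surrounding whitespace still produce the same token.
    """
    normalised = value.strip().lower()
    digest = hmac.new(secret_key, normalised.encode("utf-8"), hashlib.sha256)
    # Truncated for readability in this sketch; a real system would have
    # to choose the token length with collisions in mind.
    return "anon-" + digest.hexdigest()[:16]


key = b"example-key-held-only-by-the-registry"  # placeholder, not a real key
first = pseudonymise("Jane.Doe@example.net", key)
second = pseudonymise(" jane.doe@example.net ", key)
other = pseudonymise("john.roe@example.net", key)
print(first == second)  # True: same identity, same token
print(first == other)   # False: different identities stay distinguishable
```

The key would have to stay private to whoever runs the registry. If it
leaked, anyone could test guesses against the published tokens.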

The whole dataset should be organisation centric. Every piece of data
entered into the database should be directly or indirectly linked to
an organisation described in the dataset. There is no reason to allow
anonymous or orphaned data to be entered.

All changes of this nature could be made independently and gradually
introduced. But we do need a road map based on a bigger picture so we
know where we are heading. Especially for the core changes.



If there is one thing I want you to consider from this message it is this:
id nunc, aliquo tempore postea fit numquam

I am not well known for my language skills so let me say it in English:
do it now, sometime later becomes never

cheers
denis
co-chair DB-WG
