Hi,

I just wanted to clarify my stance on validation a bit more.

I am totally against trying to validate the data itself; that is not
what the NCC is supposed to do.
Validating the format of the CSV might be okay, but honestly anything
beyond checking that the URL does not return a 404 Not Found is a bit
too much in my opinion.

I also agree with Leo's points regarding fixing the data; I believe
that the data publishers have a pretty strong incentive to keep the
data accurate.
And as Leo also mentions, the tech-c and/or admin-c contacts are
published, so finding a reporting mechanism for issues would not be
very difficult.
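Just to illustrate how easy that lookup is (a hypothetical sketch;
the prefix at the bottom is simply RIPE NCC's own, used as an example):

    # Pull tech-c/admin-c lines for a prefix from the RIPE whois
    # service (whois.ripe.net, port 43) so errors can be reported.
    import socket

    def whois_contacts(prefix: str) -> list[str]:
        with socket.create_connection(("whois.ripe.net", 43),
                                      timeout=10) as s:
            s.sendall((prefix + "\r\n").encode("ascii"))
            chunks = []
            while data := s.recv(4096):
                chunks.append(data)
        text = b"".join(chunks).decode("utf-8", errors="replace")
        return [line for line in text.splitlines()
                if line.startswith(("tech-c:", "admin-c:"))]

    print(whois_contacts("193.0.0.0/21"))  # example prefix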

As for malformed data: if I were writing a parser, I would probably
just skip that entry, log the error, and report it to an engineer,
who could then forward it to the admin contact if they determine it
to be a real issue.
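Roughly like this, assuming the RFC 8805 geofeed layout
(prefix,country,region,city,postal); a sketch, not a spec:

    # Lenient geofeed parser: skip malformed entries, log them for
    # an engineer to look at, keep everything else.
    import csv
    import ipaddress
    import logging

    log = logging.getLogger("geofeed")

    def parse_geofeed(text: str) -> list[tuple]:
        entries = []
        for lineno, row in enumerate(csv.reader(text.splitlines()),
                                     start=1):
            if not row or row[0].lstrip().startswith("#"):
                continue  # blank line or comment, per RFC 8805
            try:
                prefix = ipaddress.ip_network(row[0].strip())
            except ValueError:
                log.error("line %d: bad prefix %r, entry ignored",
                          lineno, row[0])
                continue
            if len(row) > 5:
                log.error("line %d: too many fields, entry ignored",
                          lineno)
                continue
            entries.append((prefix, *row[1:]))
        return entries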

In the interest of not delaying this indefinitely: while it shouldn't
be rushed, I am not sure how realistic this issue would be in practice
or how much harm it would actually cause anyone.

Also, the amount of validation done could always be changed in the
future if this is shown to be an actual real-world problem.

-Cynthia

On Wed, Apr 7, 2021 at 10:58 PM Leo Vegoda <l...@vegoda.org> wrote:
>
> Hi Denis,
>
> This message is in response to several messages in the discussion.
>
> In brief: I have seen network operators distraught because their
> network was misclassified as being in the wrong geography for the
> services their customers needed to access and they had no way to fix
> that situation. I feel that publishing geofeed data in the RIPE
> Database would be a good thing to do as it helps network operators
> share data in a structured way and should reduce the overall amount of
> pain from misclassified networks.
>
> I personally would like to see an agreement on your draft problem
> statement and some feedback from the RIPE NCC before focusing on some
> of the more detailed questions you raised.
>
> I also agree with you that accurate and reliable data is important. But...
>
> On Wed, Apr 7, 2021 at 7:19 AM denis walker via db-wg <db-wg@ripe.net> wrote:
>
> [...]
>
> > You say most consumers of this geofeed data
> > will be software capable of validating the csv file. What will this
> > software do when it finds invalid data? Just ignore it? Will this
> > software know who to report data errors to? Will it have any means to
> > follow up on reported errors?
>
> I would have thought that anyone implementing a parser for this data
> would also be able to query the database for a tech-c and report
> validation failures. Based on my previous interactions with the
> network operators who have suffered misclassification, I am confident
> that there is a strong incentive for networks to publish well
> formatted accurate data and to fix any errors quickly.
>
> That said, there are many possible ways to reduce the risk of badly
> formatted data. For instance, the RIPE NCC could offer a tool to
> create the relevant files to be published through the LIR Portal or as
> a standalone tool. This is why I'd like to see feedback from the RIPE
> NCC ahead of an implementation discussion.
>
> > Services like geofeed are good ideas. But if the data quality or
> > accessibility deteriorates over time it becomes useless or even
> > misleading. That is why I believe centralised validation, testing
> > and reporting are helpful. I think the RIRs are well positioned
> > for doing these tasks and should do more of them.
>
> I agree with you that defining what data means and keeping it accurate
> is important. But in the case of geo data, could the RIPE NCC validate
> the content as well as the data structures? I'd have thought that the
> publishers and the users of the data would be in the best position to
> do that. Am I wrong?
>
> Kind regards,
>
> Leo
