I didn't think of the niceties that databases offer so far, but that looks useful indeed.

A database does not have to be PostgreSQL. It can be simple and shareable. It can be an Emacs hash table. It can be an XML file. It can be an Org file. It can be a SQLite file in your git repository. It does not matter.

What matters is that there is a centralized place for the metadata. A single source of truth. A place where you can store arbitrary metadata—not just a limited set of fields that fit in a link syntax.

Inserting limited metadata is not arbitrary metadata. If you can only fit a few fields in the link, you are not supporting arbitrary metadata. You are supporting a limited set of fields that someone decided were important.

Arbitrary metadata means any field, any value, any structure, any relationship. That cannot fit in a link. That requires a data store. That requires a database—whether it is PostgreSQL, SQLite, or an Org file with properties.
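To make "any field, any value, any relationship" concrete, here is a minimal sketch of such a data store using SQLite from Python. The table name `metadata`, the helper functions, and the IDs like `link-0001` are all hypothetical, chosen only for illustration; this is not how any particular system does it.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) a tiny metadata store; the schema is a hypothetical example."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS metadata ("
        " link_id TEXT NOT NULL,"
        " field TEXT NOT NULL,"
        " value TEXT NOT NULL)"
    )
    return db

def add_field(db, link_id, field, value):
    """Attach any field/value pair to a link; field names are not restricted."""
    db.execute("INSERT INTO metadata VALUES (?, ?, ?)", (link_id, field, value))

def fields_of(db, link_id):
    """Return all metadata of one link as (field, value) pairs, in insertion order."""
    rows = db.execute(
        "SELECT field, value FROM metadata WHERE link_id = ? ORDER BY rowid",
        (link_id,),
    )
    return rows.fetchall()

db = open_store()
add_field(db, "link-0001", "title", "Meeting notes")
add_field(db, "link-0001", "tag", "project-x")
add_field(db, "link-0001", "tag", "2024")               # repeated fields are allowed
add_field(db, "link-0001", "related-to", "link-0042")   # a relationship to another link
print(fields_of(db, "link-0001"))
```

The link itself only has to carry `link-0001`; everything else lives in the store and can grow without touching the link.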

Let me explain what kind of link-based workflow I have in mind before saying what system is best suited. My main focus is to make links reliable, in the sense that when you follow a link, it always just works; in linkin-org, this is ensured with a decentralized id system etc, but that's irrelevant for the current matter (check out https://github.com/Judafa/linkin-org to know more). From the user's point of view, it should feel as if you merged org-mode with dired as your links (with file type) are access points as reliable as a file listed in dired.

That means the principle I have explained fits your own philosophy perfectly.

I am just not sure whether you are saying one thing but not following it, or simply not seeing it.

The architectural solution for links that never break is to give them unique IDs and to manage any change to a link centrally. That way your Org links do not break. Say a website link is no longer valid: you could edit the central place of links (call it a database), and then all the Org links would point to the archived version of the page, so the user doesn't have to change each particular link in many Org files.

I do not know about you, but I had thousands of Org files, one for each person. Imagine the problem of editing every single link in those files. It is quite different when you use centralized links and just reference them by UUID or ID.
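A minimal sketch of that principle, assuming a hypothetical `links` table in SQLite and made-up example URLs: the Org files would contain only the ID (here `a1b2`), and the target is looked up at follow time, so one update in the central place retargets every reference at once.

```python
import sqlite3

# Hypothetical central registry: one row per link, keyed by its ID.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE links (id TEXT PRIMARY KEY, target TEXT NOT NULL)")

def register(link_id, target):
    """Record a new link under its ID."""
    db.execute("INSERT INTO links VALUES (?, ?)", (link_id, target))

def retarget(link_id, new_target):
    """Change where an existing ID points; the documents that cite the ID stay untouched."""
    db.execute("UPDATE links SET target = ? WHERE id = ?", (new_target, link_id))

def resolve(link_id):
    """Return the current target for an ID, or None if the ID is unknown."""
    row = db.execute("SELECT target FROM links WHERE id = ?", (link_id,)).fetchone()
    return row[0] if row else None

register("a1b2", "https://example.com/article")
before = resolve("a1b2")

# The site went away: point the same ID at an archived copy (example URL).
retarget("a1b2", "https://archive.example.org/article")
after = resolve("a1b2")

print(before)
print(after)
```

However many files mention `a1b2`, none of them has to be edited.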

For a link (with file type) to work, you just need two things: the link and the file.
Makes sense, trivial, and that's meant to be this way.

Trivial for a small file and a single user who has time for thinking without practicing.

For any user beyond that, even well below the level of power users, the system you are proposing is shallow.

Files can change. Is the link always going to work?

If files are indexed in the database and you move them in the database, then the file system can be forgotten, and links will always work that way (for as long as the hard disk and computer work).

If files do not have metadata (indexing), then when a file is renamed or moved, the link is gone and destroyed.

The solution to keep links working long term is to have an indexed list of links and to reference them by their ID.

By extension, I see any third-party, centralized intermediary as a weakness towards reliable links.

I don't know what you mean by third parties. I have not suggested having a third party. Having a central place of links means that you have to own and control your central place of links. That is not a third party.

If you rely on third parties such as archive.org, whose hard disks and indexing system you do not own, then you are more likely to lose reliable links.

Correct me if I'm wrong, but in case the database is lost/corrupted/non-updated, then the links may not work anymore.

Sure, but the same can be said for your files: if they are lost or corrupted, you will not have reliable links. That is not the context of the principle explained.

And this would be the worst outcome: you're left with years of notes full of links that are now useless.

Same thing if your hard disk gets corrupted.

Backing up the database is far simpler than backing up the file system.

The issue of backing up your digital data is not really the subject, is it?

Arbitrary data in links is fully solved on my side. I told you I have unlimited information on links, truly arbitrary information, and I am using the principle explained.

I'm primarily targeting a fully decentralized system, which fits best with the philosophy of org mode imo.

I am trying to understand how a "decentralized system" fits with arbitrary metadata, but okay, maybe you mean you wish to have each link not centrally indexed, with all information stored in the simple link.

Arbitrary data means unlimited fields, values, structures, and relationships. That cannot fit into a link, a filename, or a plain text file; it needs a data store.

This comes with a whole bunch of desirable niceties: you can move/update your linked file anywhere (dropbox app on your phone, whatever) without notifying a database, the link still works.

That relates to file synchronization, not a knowledge management workflow.

The Dynamic Knowledge Repository was invented by Doug Engelbart, so you should look into his work. He invented it for exactly the same purpose you are describing. Maybe it is a lot to read?

But let me look at it practically: if I have my database, and it is accessible, then I can move my files anywhere I wish and the links will still work.

I would say it may even combine neatly: one can use the id of a file as a last resort to make sure the database cannot lose track of the file.

For files larger than a few bytes, I keep their index with hashes, so re-indexing would find those files. That is another architectural principle you could use so that your links always stay the same, even if a file is moved on the file system or renamed, as long as its content is not changed.

And if you wish to change the file, then maybe such a change should go through the index layer, so that the file is first recognized, then changed, and re-indexed after the change. That way you could arbitrarily change files and have links still working, as long as you use their IDs.
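As a rough sketch of the hash-based re-indexing described above (not my actual implementation): the index maps a link ID to a content hash and a last known path, and a re-index pass walks the directory tree to find each file again by its hash. The function names, the dictionary layout, and the choice of SHA-256 are assumptions made for the example.

```python
import hashlib
import os
import tempfile

def file_hash(path):
    """Return the SHA-256 hex digest of a file's content."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def reindex(root, index):
    """Walk root and refresh the path of every file whose hash is already indexed.

    index maps link ID -> {"hash": ..., "path": ...} (hypothetical layout).
    """
    by_hash = {entry["hash"]: entry for entry in index.values()}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            entry = by_hash.get(file_hash(path))
            if entry is not None:
                entry["path"] = path
    return index

# Demo in a throwaway directory: index a file, then move and rename it.
root = tempfile.mkdtemp()
old = os.path.join(root, "report.txt")
with open(old, "w") as f:
    f.write("quarterly report")
index = {"id-1": {"hash": file_hash(old), "path": old}}

os.mkdir(os.path.join(root, "archive"))
new = os.path.join(root, "archive", "renamed.txt")
os.rename(old, new)

reindex(root, index)
print(index["id-1"]["path"] == new)
```

The link keeps referring to `id-1` throughout. Two caveats of this sketch: files with identical content share a hash and would need extra disambiguation, and an edited file must be re-hashed through the index, which is the point of the paragraph above.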

PS: Some more minor remarks:
-
Link creation or capturing should be instant. 1-3 seconds.
The link is created automatically, obviously.
- with a database, no easy way to share data afaict. Otoh if the metadata is inside the link, then that's as simple as pasting the link, putting the file in an attachment, and sending the email.

Sharing with a database is trivial. You export the object to a self-contained format—JSON, XML, or even an Org property drawer—and send it. The recipient imports it into their own database. The link remains the same. The metadata remains intact. The relationships remain preserved.

Alternatively, you can embed the metadata in the link at export time. The database stores the metadata. The link is the identifier. When you export for sharing, you expand the metadata into the link. When you import, you extract it back into the database.

This gives you the best of both worlds: your internal system is queryable, versioned, and scalable, and your shared links are self-contained.
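The export and import round trip can be sketched in a few lines, assuming JSON as the self-contained format. Plain dictionaries stand in for the sender's and recipient's databases here, and the function names and the record layout are hypothetical.

```python
import json

def export_link(store, link_id):
    """Serialize one link and its metadata into a self-contained JSON string."""
    return json.dumps({"id": link_id, "metadata": store[link_id]}, sort_keys=True)

def import_link(store, payload):
    """Load an exported link into another store, keeping the same ID."""
    record = json.loads(payload)
    store[record["id"]] = record["metadata"]
    return record["id"]

# The sender's store: a dict standing in for a real database.
sender = {
    "link-0001": {
        "title": "Meeting notes",
        "tags": ["project-x", "2024"],
        "related": ["link-0042"],
    }
}

payload = export_link(sender, "link-0001")   # this string is what gets sent

recipient = {}
imported_id = import_link(recipient, payload)
print(imported_id, recipient[imported_id])
```

The payload could just as well be pasted into an email body or expanded into the link text at export time; the ID and the metadata survive the trip either way.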

Your system cannot do the reverse. Once the metadata is embedded in the link, you cannot extract it into a database without parsing every link. You are stuck with embedded metadata forever.

Just as you would need to "attach the file" to share a link inside an Org file, I also need to press a few buttons, and I can export whatever links I want and share them with people.

--
Jean Louis
