Re: [PATCH v3] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
Max Nikulin writes: >> I do not think that it is a good idea. We currently promise to use the >> first function in the list to generate the IDs. Other functions are >> mostly a fallback to catch the existing attach dirs created in the past >> using different ID scheme. > > My idea is to add an extra check that filters out nil values. I do not > mind to signal a user error if first entry from > `org-attach-id-to-path-function-list' returns nil and paths returned by > other function do not exist. This would be possible, in addition to the patch above. Or do you mean the default ts/uuid format functions to return nil to short IDs as well? >> I guess that one option could be demanding a non-nil return value when >> generating a new attach dir (without TRY-ALL argument) and wrapping >> things into (ignore-errors ...) otherwise. > > I am against `ignore-errors' because it makes harder bug hunting using > debug on errors. Agreed. > I would prefer to force users to justify strategy for non-standard IDs > as early as possible. The goal is to avoid manual actions to change > directory structure when users have hundreds of folders. My experience > that users may easily face performance issues with ~1000 files per > directory. The reason is either lack of optimization for such case in > particular application or a bug introduced in a new version. I am not sure if we can solve this kind of issue reliably. The only idea is enforcing [1-9][0-9]\{5\} for ts IDs. And we cannot do much about IDs prefixed with the same constant string. -- Ihor Radchenko, Org mode contributor, Learn more about Org mode at https://orgmode.org/. Support Org development at https://liberapay.com/org-mode, or support my work at https://liberapay.com/yantar92
Re: [PATCH v3] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
On 13/08/2022 12:29, Ihor Radchenko wrote: Max Nikulin writes: I believe that interaction between `org-attach-dir-from-id' and members of `org-attach-id-to-path-function-list' could be improved. If a too short ID is passed to `org-attach-id-uuid-folder-format', `org-attach-id-ts-folder-format', or a user-defined function, they may return nil and `org-attach-dir-from-id' should try next element from the list. If nothing succeeds then a user error should be signaled demanding to implement a strategy to properly deal with peculiar IDs. I do not think that it is a good idea. We currently promise to use the first function in the list to generate the IDs. Other functions are mostly a fallback to catch the existing attach dirs created in the past using different ID scheme. My idea is to add an extra check that filters out nil values. I do not mind to signal a user error if first entry from `org-attach-id-to-path-function-list' returns nil and paths returned by other function do not exist. I guess that one option could be demanding a non-nil return value when generating a new attach dir (without TRY-ALL argument) and wrapping things into (ignore-errors ...) otherwise. I am against `ignore-errors' because it makes harder bug hunting using debug on errors. I would prefer to force users to justify strategy for non-standard IDs as early as possible. The goal is to avoid manual actions to change directory structure when users have hundreds of folders. My experience that users may easily face performance issues with ~1000 files per directory. The reason is either lack of optimization for such case in particular application or a bug introduced in a new version.
Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
On 12/08/2022 23:08, Janek F wrote: I use org-roam which creates links between org-files using the ID, and I would like to be able to manually add such links to common pages even if autocompletion doesn't work smoothly. Unfortunately I have not tried org-roam yet. My suggestion is to add CUSTOM_ID property in addition to ID. I admit that CUSTOM_ID may be inconvenient if a heading is moved from one file to another. Are there more problems with CUSTOM_ID links? Considering the other messages, let's not get lost in details, the default function does not need to consider all edge cases. It simply should not error out on ids in a different format. Janek, what is your expectation concerning directory structure for attachments in the case of customized ID values? You have not provided details concerning your configuration, but perhaps your issue may be solved by customizing `org-attach-id-to-path-function-list': keep `org-attach-id-uuid-folder-format' and drop `org-attach-id-ts-folder-format'. It should help unless you have single character IDs. I do not think Org behavior is ideal in your case. On the other hand the code was written with assumption that either UUID or timestamp-based IDs are used. It is violated by handcrafted IDs. I do not know the full story, but I guess, `org-attach-id-ts-folder-format' is present in default configuration for backward compatibility reasons. org-attach-id-ts-folder-format: Args out of range: "ftt", 0, 6 is an obscure error. However Org needs some hint from user how to handle such case.
Re: [PATCH v3] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
Max Nikulin writes: > was a really bad one. It leads to a separate directory for each short > ID. However I still believe that the purpose of > `org-attach-id-ts-folder-format' is avoid concentration of huge number > of file in a single directory. Distribution of attachments over > subdirectories is not perfectly even but it still works reasonably well > for long-lasting projects while IDs follow assumed format and month is > included into directory names. This is only true for uuid-style. Also, this is likely not the problem we need to worry about. To be a problem, the number of attach dirs should be really large, which is only relevant when thousands of attach dirs are cramped into a single parent directory. Thousands of attach dirs would probably correspond to tens of thousands of headings, which is not something we commonly have and, honestly, Org will have much more serious issues when the number of headings is that large. If someone ends up complaining about too many attach dirs, we can always ask to adjust the ID naming scheme. The previous paragraph appears to be consistent with > Back to the original problem. First of all, I believe that if a user > decided to use non-standard IDs then it is their responsibility to > customize `org-attach-id-to-path-function-list' and to provide a > function that implements alternative layout of attachment files. > Depending on expected amount, they may be put into single directory, > spread using some hash function, or accordingly to project structure. So > today I am against default fallback directories especially in > `org-attach-id-ts-folder-format'. --- > I believe that interaction between `org-attach-dir-from-id' and members > of `org-attach-id-to-path-function-list' could be improved. If a too > short ID is passed to `org-attach-id-uuid-folder-format', > `org-attach-id-ts-folder-format', or a user-defined function, they may > return nil and `org-attach-dir-from-id' should try next element from the > list. If nothing succeeds then a user error should be signaled demanding > to implement a strategy to properly deal with peculiar IDs. I do not think that it is a good idea. We currently promise to use the first function in the list to generate the IDs. Other functions are mostly a fallback to catch the existing attach dirs created in the past using different ID scheme. I guess that one option could be demanding a non-nil return value when generating a new attach dir (without TRY-ALL argument) and wrapping things into (ignore-errors ...) otherwise. > There should be no problem to put single character IDs into a common > directory (UUID case), but there are enough variants for 5 characters > long names to create delayed performance problem. The worst case is when > a user decides to use some common prefix, e.g. "id-" before UUID and all > files go to the same directory. On the other hand I do not think that > code should detect such cases. We should not worry about this. It will be the user responsibility to create the relevant custom function. -- Ihor Radchenko, Org mode contributor, Learn more about Org mode at https://orgmode.org/. Support Org development at https://liberapay.com/org-mode, or support my work at https://liberapay.com/yantar92
Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
Janek F writes: > Considering the other messages, let's not get lost in details, the default > function does not need to consider all edge cases. It simply should not error > out on ids in a different format. You are indeed right, but we may use multiple different approaches to achieve this. Some are more simplistic, some are less. We are trying to find a balance between less hassle for users and not overthinking things. The simplest approach would be wrapping the function calls into (ignore-errors ...), which would avoid errors when we just try to check the existence of the attachment dir. However, it may raise unexpected errors for non-standard/copy-pasted IDs for users who did not customize this area. The approach I am proposing is handling unexpected IDs inside the default functions and bringing errors in user-defined functions to the attention of user - I feel that it will be more useful than ignoring errors. We could use yet another approach and use a global fallback if no available function can generate an ID, but we may run into tricky cases because the first function in the list has a special meaning - it _must_ generate a path from ID or the user may run into unexpected breakage in some cases when custom IDs are being used. -- Ihor Radchenko, Org mode contributor, Learn more about Org mode at https://orgmode.org/. Support Org development at https://liberapay.com/org-mode, or support my work at https://liberapay.com/yantar92
Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
I use org-roam which creates links between org-files using the ID, and I would like to be able to manually add such links to common pages even if autocompletion doesn't work smoothly. Considering the other messages, let's not get lost in details, the default function does not need to consider all edge cases. It simply should not error out on ids in a different format. --- Original Message --- Max Nikulin schrieb am Freitag, 12. August 2022 um 17:35: > On 21/07/2022 02:12, Janek F wrote: > > > However I tend to customize IDs for important files by hand, > > causing any attempt to use org-attach on that file to fail > > if the ID is shorter than six characters: > > > > org-attach-id-ts-folder-format: Args out of range: "ftt", 0, 6 > > > > This method should be adjusted to handle non-ts-ids just as well, > > as org-id-method does not dictate the format of existing ids. > > > There are should be no such restriction for CUSTOM_ID links (referenced > using #anchor). IDs used for attachments assume particular format and > consistent layout of files. Janek, could you, please, provide some > details concerning your use case. I am interested why you decided to > avoid CUSTOM_ID for headings important to you. > > As I wrote yesterday, you can customize Org to use directories for > attachments you like.
Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
On 21/07/2022 02:12, Janek F wrote: However I tend to customize IDs for important files by hand, causing any attempt to use org-attach on that file to fail if the ID is shorter than six characters: org-attach-id-ts-folder-format: Args out of range: "ftt", 0, 6 This method should be adjusted to handle non-ts-ids just as well, as org-id-method does not dictate the format of existing ids. There are should be no such restriction for CUSTOM_ID links (referenced using #anchor). IDs used for attachments assume particular format and consistent layout of files. Janek, could you, please, provide some details concerning your use case. I am interested why you decided to avoid CUSTOM_ID for headings important to you. As I wrote yesterday, you can customize Org to use directories for attachments you like.
Re: [PATCH v3] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
On 11/08/2022 11:19, Ihor Radchenko wrote: Max Nikulin writes: I slightly dislike the "___xx" compared to "__" because it will create a proliferation of top-level folders as opposed to cramping the non-standard IDs into a single "__" folder. I believed that proliferation of folders is for purpose. Intermediate directories allows to avoid excessive number of files in single directory. ext4 with directory tree index usually is not the case, but other filesystems may have rather poor performance when too much files are stuffed into single folder. Some applications become really slow for huge directories. I was referring to TS style timestamp resolver here. It is designed to group directories by creation time, not to distribute them homogeneously. My bad, I have realizes that my idea of mapping "x" -> "__x/x" "xy" -> "_xy/xy" was a really bad one. It leads to a separate directory for each short ID. However I still believe that the purpose of `org-attach-id-ts-folder-format' is avoid concentration of huge number of file in a single directory. Distribution of attachments over subdirectories is not perfectly even but it still works reasonably well for long-lasting projects while IDs follow assumed format and month is included into directory names. Back to the original problem. First of all, I believe that if a user decided to use non-standard IDs then it is their responsibility to customize `org-attach-id-to-path-function-list' and to provide a function that implements alternative layout of attachment files. Depending on expected amount, they may be put into single directory, spread using some hash function, or accordingly to project structure. So today I am against default fallback directories especially in `org-attach-id-ts-folder-format'. I believe that interaction between `org-attach-dir-from-id' and members of `org-attach-id-to-path-function-list' could be improved. If a too short ID is passed to `org-attach-id-uuid-folder-format', `org-attach-id-ts-folder-format', or a user-defined function, they may return nil and `org-attach-dir-from-id' should try next element from the list. If nothing succeeds then a user error should be signaled demanding to implement a strategy to properly deal with peculiar IDs. There should be no problem to put single character IDs into a common directory (UUID case), but there are enough variants for 5 characters long names to create delayed performance problem. The worst case is when a user decides to use some common prefix, e.g. "id-" before UUID and all files go to the same directory. On the other hand I do not think that code should detect such cases.
Re: [PATCH v3] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
Max Nikulin writes: >>> Please, do not use path components starting with dash, it is terrible >>> for CLI tools. By the way, you promised underscores, not dashes. >> >> Why? > > It is inconvenient in interactive sessions (in scripts appropriate > measures must be taken anyway), the following does not list content of > the "--" directory: > > ls -l -- Thanks for the clarification. >> I slightly dislike the "___xx" compared to "__" because it will >> create a proliferation of top-level folders as opposed to cramping the >> non-standard IDs into a single "__" folder. > > I believed that proliferation of folders is for purpose. Intermediate > directories allows to avoid excessive number of files in single > directory. ext4 with directory tree index usually is not the case, but > other filesystems may have rather poor performance when too much files > are stuffed into single folder. Some applications become really slow for > huge directories. I was referring to TS style timestamp resolver here. It is designed to group directories by creation time, not to distribute them homogeneously. > On the other hand there are not so many distinct file names that may > fall into "__" directory, so perhaps it is OK. Yes. Such short IDs are not supposed to be common and even if they are, the maximum possible number of such IDs is relatively small. -- Ihor Radchenko, Org mode contributor, Learn more about Org mode at https://orgmode.org/. Support Org development at https://liberapay.com/org-mode, or support my work at https://liberapay.com/yantar92
Re: [PATCH v3] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
On 10/08/2022 20:23, Ihor Radchenko wrote: Max Nikulin writes: +the path." + (if (< (length id) 3) + (format "--/%s" id) Please, do not use path components starting with dash, it is terrible for CLI tools. By the way, you promised underscores, not dashes. Why? It is inconvenient in interactive sessions (in scripts appropriate measures must be taken anyway), the following does not list content of the "--" directory: ls -l -- I slightly dislike the "___xx" compared to "__" because it will create a proliferation of top-level folders as opposed to cramping the non-standard IDs into a single "__" folder. I believed that proliferation of folders is for purpose. Intermediate directories allows to avoid excessive number of files in single directory. ext4 with directory tree index usually is not the case, but other filesystems may have rather poor performance when too much files are stuffed into single folder. Some applications become really slow for huge directories. On the other hand there are not so many distinct file names that may fall into "__" directory, so perhaps it is OK. Concerning "unknown" I am less sure. I do not have strong opinion how to properly deal with short IDs.
[PATCH v3] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
Max Nikulin writes: > On 10/08/2022 18:43, Ihor Radchenko wrote: >> Ihor Radchenko writes: >> >> I have updated the patch to use "__/id" when id is too short. >> Any objections? >> >> +When ID is too short (less than 3 chars), use its md5 hash to create > > Misleading docs, you do not use md5. Oops. Will fix. >> +the path." >> + (if (< (length id) 3) >> + (format "--/%s" id) > > Please, do not use path components starting with dash, it is terrible > for CLI tools. By the way, you promised underscores, not dashes. Why? I have no opinion about the possible dummy folder name, except that it should fit the general pattern we already have "xx/..." or "MM/...". >> +(format "%s/%s" >> +(substring id 0 2) >> +(substring id 2 > > Ihor, I have not look into the code around, so my suggestions may have > no sense. > > Is it possible to pass empty string as ID to these functions? Should it > be explicitly checked? It should not be possible under normal operation. > What if ID contains "/" that can not be used in file name? Windows has > more forbidden characters. Then, creating the attachment dir will fail in front of the user. I am not sure if we really need to do much about this. At least, not until we get a bug report. If we really need to solve the edge cases with attach dir generation, we may need to change the overall design in org-attach/org-id to something more restrictive. I am not sure if it is worth the effort compared to other directions where to improve Org. > Do you expect any problem if here (and for timestamp-based ids) > directory component is just padded with some character (unsure what can > better than underscore) to required length? > > "x" -> "_x/x" or "__x/x" > "xy" -> "xy/xy" or "_xy/xy" > etc. > > From my point of view it might be a bit better than "__" and "unknown". Does it make sense? Timestamp-based IDs are like 20220810T210121.478800 by default. Then, the top-level folders will be like 202208/... If the actual format is different, we cannot possibly expect the timestamp to follow the same pattern, which is why I chose "unknown" referring to unknown date. On the other hand, if an ID has more than 6 characters, it will generate nonsense top-level folder with or without the patch. We could go further and match the ID against [1-9][0-9]\{5\} and put the folder into "unknown/ID" instead, but that would be a breaking change. In summary, I am more of less neutral towards this fallback format, except that it would be nice to be able to recover the ID from the directory path. Padding should be recoverable. I slightly dislike the "___xx" compared to "__" because it will create a proliferation of top-level folders as opposed to cramping the non-standard IDs into a single "__" folder. Attaching the updated patch. >From 50d0f9de0acdf5d67b797476816cbeb40b19f554 Mon Sep 17 00:00:00 2001 Message-Id: <50d0f9de0acdf5d67b797476816cbeb40b19f554.1660137585.git.yanta...@gmail.com> From: Ihor Radchenko Date: Sat, 23 Jul 2022 13:13:24 +0800 Subject: [PATCH v3] org-attach-dir-from-id: Do not rely on ID being over 6 chars long * lisp/org-attach.el (org-attach-id-uuid-folder-format): Fall back to "__/ID" when the ID contains 2 chars or less and cannot be split into the "xy/z" path. (org-attach-id-ts-folder-format): Fall back to "__/ID" path format when the ID contains less than 7 chars and cannot be split into the "MM/rest" path. Fixes https://orgmode.org/list/KC8PcypJapBpJQtJxM0kX5N7Z0THL2Lq6EQjBMzpw1-vgQf72egZ2JOIlTbPYiqAVD4MdSBhrhBZr2Ykf5DN1mocm1ANvvuKKZShlkgzKYM=@pm.me --- lisp/org-attach.el | 24 1 file changed, 16 insertions(+), 8 deletions(-) diff --git a/lisp/org-attach.el b/lisp/org-attach.el index fe49af6f3..0f5d5af82 100644 --- a/lisp/org-attach.el +++ b/lisp/org-attach.el @@ -159,19 +159,27 @@ (defcustom org-attach-archive-delete nil (defun org-attach-id-uuid-folder-format (id) "Translate an UUID ID into a folder-path. Default format for how Org translates ID properties to a path for -attachments. Useful if ID is generated with UUID." - (format "%s/%s" - (substring id 0 2) - (substring id 2))) +attachments. Useful if ID is generated with UUID. + +When ID is too short (less than 3 chars), return \"__/ID\"." + (if (< (length id) 3) + (format "__/%s" id) +(format "%s/%s" + (substring id 0 2) + (substring id 2 (defun org-attach-id-ts-folder-format (id) "Translate an ID based on a timestamp to a folder-path. Useful way of translation if ID is generated based on ISO8601 timestamp. Splits the attachment folder hierarchy into -year-month, the rest." - (format "%s/%s" - (substring id 0 6) - (substring id 6))) +year-month, the rest. + +When ID is too short (less than 7 chars), return \"__/ID\"." + (if (< (length id) 7) + (format "__/%s" id) +(format "%s/%s" + (substring id 0 6) + (substring id 6 (defcustom
Re: [PATCH v2] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
On 10/08/2022 18:43, Ihor Radchenko wrote: Ihor Radchenko writes: I have updated the patch to use "__/id" when id is too short. Any objections? +When ID is too short (less than 3 chars), use its md5 hash to create Misleading docs, you do not use md5. +the path." + (if (< (length id) 3) + (format "--/%s" id) Please, do not use path components starting with dash, it is terrible for CLI tools. By the way, you promised underscores, not dashes. +(format "%s/%s" + (substring id 0 2) + (substring id 2 Ihor, I have not look into the code around, so my suggestions may have no sense. Is it possible to pass empty string as ID to these functions? Should it be explicitly checked? What if ID contains "/" that can not be used in file name? Windows has more forbidden characters. Do you expect any problem if here (and for timestamp-based ids) directory component is just padded with some character (unsure what can better than underscore) to required length? "x" -> "_x/x" or "__x/x" "xy" -> "xy/xy" or "_xy/xy" etc. From my point of view it might be a bit better than "__" and "unknown".
[PATCH v2] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
Ihor Radchenko writes: >> Ihor, I am afraid of collisions due to short input to md5. Long hash >> value gives false impression of high entropy but actually there are not >> so many variants for 1 or 2 characters strings. Either there is no point >> in making long name from short id or random string of appropriate length >> should be used instead. > > Random is not an option. If we use random string, how will we get the > right attachment path on a second call? This function must return the > same value given the same input. > > Having said that, I do agree (taking into account the other comment) > that md5 may not be a very good idea. Maybe simply something like > "__/id". Ideally, we should be able to recover the original id from the > path to attachment. I have updated the patch to use "__/id" when id is too short. Any objections? >From 7ca5fe66c53a1399943a2e2d4a1675ca8cd14194 Mon Sep 17 00:00:00 2001 Message-Id: <7ca5fe66c53a1399943a2e2d4a1675ca8cd14194.1660131747.git.yanta...@gmail.com> From: Ihor Radchenko Date: Sat, 23 Jul 2022 13:13:24 +0800 Subject: [PATCH v2] org-attach-dir-from-id: Do not rely on ID being over 6 chars long * lisp/org-attach.el (org-attach-id-uuid-folder-format): Fall back to "--/ID" when the ID contains 2 chars or less and cannot be split into the "xy/z" path. (org-attach-id-ts-folder-format): Fall back to "unknown/ID" path format when the ID contains less than 7 chars and cannot be split into the "MM/rest" path. Fixes https://orgmode.org/list/KC8PcypJapBpJQtJxM0kX5N7Z0THL2Lq6EQjBMzpw1-vgQf72egZ2JOIlTbPYiqAVD4MdSBhrhBZr2Ykf5DN1mocm1ANvvuKKZShlkgzKYM=@pm.me --- lisp/org-attach.el | 25 + 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/lisp/org-attach.el b/lisp/org-attach.el index fe49af6f3..bb7c787f7 100644 --- a/lisp/org-attach.el +++ b/lisp/org-attach.el @@ -159,19 +159,28 @@ (defcustom org-attach-archive-delete nil (defun org-attach-id-uuid-folder-format (id) "Translate an UUID ID into a folder-path. Default format for how Org translates ID properties to a path for -attachments. Useful if ID is generated with UUID." - (format "%s/%s" - (substring id 0 2) - (substring id 2))) +attachments. Useful if ID is generated with UUID. + +When ID is too short (less than 3 chars), use its md5 hash to create +the path." + (if (< (length id) 3) + (format "--/%s" id) +(format "%s/%s" + (substring id 0 2) + (substring id 2 (defun org-attach-id-ts-folder-format (id) "Translate an ID based on a timestamp to a folder-path. Useful way of translation if ID is generated based on ISO8601 timestamp. Splits the attachment folder hierarchy into -year-month, the rest." - (format "%s/%s" - (substring id 0 6) - (substring id 6))) +year-month, the rest. + +When ID is too short (less than 7 chars), return \"unknown/ID\"." + (if (< (length id) 7) + (format "unknown/%s" id) +(format "%s/%s" + (substring id 0 6) + (substring id 6 (defcustom org-attach-id-to-path-function-list '(org-attach-id-uuid-folder-format org-attach-id-ts-folder-format) -- 2.35.1 -- Ihor Radchenko, Org mode contributor, Learn more about Org mode at https://orgmode.org/. Support Org development at https://liberapay.com/org-mode, or support my work at https://liberapay.com/yantar92
Re: [PATCH] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
Max Nikulin writes: > On 23/07/2022 12:22, Ihor Radchenko wrote: >> Tentative patch is attached. > >> + (if (< (length id) 3) >> + (org-attach-id-uuid-folder-format (md5 id)) > > Ihor, I am afraid of collisions due to short input to md5. Long hash > value gives false impression of high entropy but actually there are not > so many variants for 1 or 2 characters strings. Either there is no point > in making long name from short id or random string of appropriate length > should be used instead. Random is not an option. If we use random string, how will we get the right attachment path on a second call? This function must return the same value given the same input. Having said that, I do agree (taking into account the other comment) that md5 may not be a very good idea. Maybe simply something like "__/id". Ideally, we should be able to recover the original id from the path to attachment. Best, Ihor
Re: [PATCH] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
On 23/07/2022 12:22, Ihor Radchenko wrote: Tentative patch is attached. + (if (< (length id) 3) + (org-attach-id-uuid-folder-format (md5 id)) Ihor, I am afraid of collisions due to short input to md5. Long hash value gives false impression of high entropy but actually there are not so many variants for 1 or 2 characters strings. Either there is no point in making long name from short id or random string of appropriate length should be used instead.
Re: [PATCH] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
Just now had a closer look at your patch, and I don't like the hashing. My whole idea customizing ids is to make the file structure useful by itself, too. This is how I patched it for me: (defun xf/org-attach-id-folder-format (id) "Translate any ID into a folder-path." (format "%s/%s" (substring id 0 2) (if (> (seq-length id) 2) (substring id 2) id)) ) (setq org-attach-id-to-path-function-list '(xf/org-attach-id-folder-format)) --- Original Message --- Ihor Radchenko schrieb am Samstag, 23. Juli 2022 um 07:21: > Janek F xer...@pm.me writes: > > > When setting org-id-method to 'ts or 'org, > > org-attach seems to use org-attach-id-ts-folder-format > > to create its hierarchy. > > > > However I tend to customize IDs for important files by hand, > > causing any attempt to use org-attach on that file to fail > > if the ID is shorter than six characters: > > > > org-attach-id-ts-folder-format: Args out of range: "ftt", 0, 6 > > > > This method should be adjusted to handle non-ts-ids just as well, > > as org-id-method does not dictate the format of existing ids. > > > Thanks for reporting! > Tentative patch is attached. > > I have added some fallbacks for the default attach folder formatters, so > that they can work when the ID does not conform to specific format. > > This change is somewhat opinionated, so feel free to suggest alternative > solutions/fallback options. > > Best, > Ihor
[PATCH] Re: [BUG] org-attach-id-ts-folder-format fails on customized IDs [9.6 (9.6-??-2e9999783)]
Janek F writes: > When setting org-id-method to 'ts or 'org, > org-attach seems to use org-attach-id-ts-folder-format > to create its hierarchy. > > However I tend to customize IDs for important files by hand, > causing any attempt to use org-attach on that file to fail > if the ID is shorter than six characters: > > org-attach-id-ts-folder-format: Args out of range: "ftt", 0, 6 > > This method should be adjusted to handle non-ts-ids just as well, > as org-id-method does not dictate the format of existing ids. Thanks for reporting! Tentative patch is attached. I have added some fallbacks for the default attach folder formatters, so that they can work when the ID does not conform to specific format. This change is somewhat opinionated, so feel free to suggest alternative solutions/fallback options. Best, Ihor >From e004752ba39f0d328645a1f6053ad3ce3b06ac28 Mon Sep 17 00:00:00 2001 Message-Id: From: Ihor Radchenko Date: Sat, 23 Jul 2022 13:13:24 +0800 Subject: [PATCH] org-attach-dir-from-id: Do not rely on ID being over 6 chars long * lisp/org-attach.el (org-attach-id-uuid-folder-format): Fall back to using ID md5 hash when the ID contains 2 chars or less and cannot be split into the "xy/z" path. (org-attach-id-ts-folder-format): Fall back to "unknown/ID" path format when the ID contains less than 7 chars and cannot be split into the "MM/rest" path. Fixes https://orgmode.org/list/KC8PcypJapBpJQtJxM0kX5N7Z0THL2Lq6EQjBMzpw1-vgQf72egZ2JOIlTbPYiqAVD4MdSBhrhBZr2Ykf5DN1mocm1ANvvuKKZShlkgzKYM=@pm.me --- lisp/org-attach.el | 25 + 1 file changed, 17 insertions(+), 8 deletions(-) diff --git a/lisp/org-attach.el b/lisp/org-attach.el index 36c21b702..7c72fd7ee 100644 --- a/lisp/org-attach.el +++ b/lisp/org-attach.el @@ -159,19 +159,28 @@ (defcustom org-attach-archive-delete nil (defun org-attach-id-uuid-folder-format (id) "Translate an UUID ID into a folder-path. Default format for how Org translates ID properties to a path for -attachments. Useful if ID is generated with UUID." - (format "%s/%s" - (substring id 0 2) - (substring id 2))) +attachments. Useful if ID is generated with UUID. + +When ID is too short (less than 3 chars), use its md5 hash to create +the path." + (if (< (length id) 3) + (org-attach-id-uuid-folder-format (md5 id)) +(format "%s/%s" + (substring id 0 2) + (substring id 2 (defun org-attach-id-ts-folder-format (id) "Translate an ID based on a timestamp to a folder-path. Useful way of translation if ID is generated based on ISO8601 timestamp. Splits the attachment folder hierarchy into -year-month, the rest." - (format "%s/%s" - (substring id 0 6) - (substring id 6))) +year-month, the rest. + +When ID is too short (less than 7 chars), return \"unknown/ID\"." + (if (< (length id) 7) + (format "unknown/%s" id) +(format "%s/%s" + (substring id 0 6) + (substring id 6 (defcustom org-attach-id-to-path-function-list '(org-attach-id-uuid-folder-format org-attach-id-ts-folder-format) -- 2.35.1