Taking another swing at this. The 238 choice would mean that any valid 4.0 db name can be replicated back to < 4.0 under the same name. A longer name cannot, because to succeed those versions would have to make internal shard names that exceed common filesystem filename lengths.
Fine, I accept that. Backward compatibility is a tricky balance.

My position is that 256 is a sensible limit for db name. It matches many filesystem limits, it is quite generous, and it is a small fraction of the fdb key / value limit.

On compatibility, with a 256 limit, you can replicate a < 4.0 db to 4.0 with the same name and back again (since any existing < 4.0 db must have successfully made internal shards). I think that is enough backward compatibility.

I don't have a final say here, so please everybody that cares chime in now. My vote is for a 256 character limit for 4.0 onward, non-configurable (all couchdb 4.0 installs, and anything claiming compatibility with it, all supporting and enforcing this limit for onward replication compatibility).

B.

> On 12 May 2020, at 21:05, Robert Newson <[email protected]> wrote:
>
> I still don’t understand how the internal shard database name format has any
> bearing on our public interface, present or future.
>
> --
> Robert Samuel Newson
> [email protected]
>
> On Tue, 12 May 2020, at 19:52, Nick Vatamaniuc wrote:
>> I still like it. It's only 18 bytes difference but it introduces one
>> more compatibility issue. At least for 4.x, it would be nice to have
>> less of those and we can always increase it later. But if other
>> participants think it's too nitpick-y and odd I am happy to go with
>> 256.
>>
>> -Nick
>>
>> On Tue, May 12, 2020 at 9:24 AM Robert Samuel Newson <[email protected]>
>> wrote:
>>>
>>> Sorry to let this thread drop.
>>>
>>> Nick, are you still preferring 238?
>>>
>>> B.
>>>
>>>> On 4 May 2020, at 21:06, Robert Samuel Newson <[email protected]> wrote:
>>>>
>>>> Ah, ok, understood. I don't think that's a compelling reason to fix our
>>>> maximum database name length at 238.
>>>>
>>>> CouchDB 4.0 will be the first version of CouchDB where we're not coupled
>>>> to the filesystem for this list. 256 is very common for a filesystem
>>>> filename length limit (though not universal) so I don't think our history
>>>> should dictate an odd (fine, _even_) choice of 238.
>>>>
>>>> B.
>>>>
>>>>
>>>>> On 4 May 2020, at 20:41, Nick Vatamaniuc <[email protected]> wrote:
>>>>>
>>>>> It will prevent replicating from db created in 4.0 which has a name
>>>>> longer than 238 (say 250) back to 2.x/3.x if the user intends to keep
>>>>> the same database name on both systems, that's what I meant.
>>>>>
>>>>> On Mon, May 4, 2020 at 3:15 PM Robert Samuel Newson <[email protected]>
>>>>> wrote:
>>>>>>
>>>>>> The 'timestamp in filename' is only on the internal shards, which would
>>>>>> not be part of a replication between 2.x/3.x and 4.x.
>>>>>>
>>>>>> In any case, Nick is suggesting lowering from 256 chars to 238 chars to
>>>>>> leave room for these things that won't be there. I confess I don't
>>>>>> understand the reasoning.
>>>>>>
>>>>>> B.
>>>>>>
>>>>>>> On 4 May 2020, at 20:04, Joan Touzet <[email protected]> wrote:
>>>>>>>
>>>>>>> I suspect he means when replicating back to a 3.x or 2.x cluster.
>>>>>>>
>>>>>>> On 2020-05-04 3:03 p.m., Robert Samuel Newson wrote:
>>>>>>>> But we don't need to add a file extension or a timestamp to database
>>>>>>>> names.
>>>>>>>> B.
>>>>>>>>> On 4 May 2020, at 18:42, Nick Vatamaniuc <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>> Hello everyone,
>>>>>>>>>
>>>>>>>>> Good idea, +1 with one minor tweak: database name length in versions
>>>>>>>>> <4.0 was restricted by the maximum file name on whatever file system
>>>>>>>>> the server was running on. In practice that was 255, then there is an
>>>>>>>>> extension and a timestamp in the filename which made the db name limit
>>>>>>>>> be 238 so I suggest to use that instead.
>>>>>>>>>
>>>>>>>>> -Nick
>>>>>>>>>
>>>>>>>>> On Mon, May 4, 2020 at 11:51 AM Robert Samuel Newson
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I think I speak for many in accepting the risk that we're excluding
>>>>>>>>>> doc ids formed from 4096-bit RSA signatures.
>>>>>>>>>>
>>>>>>>>>> I don't think I made it clear but I think these should be fixed
>>>>>>>>>> limits (i.e., not configurable) in order to ensure inter-replication
>>>>>>>>>> between couchdb installations wherever they are.
>>>>>>>>>>
>>>>>>>>>> B.
>>>>>>>>>>
>>>>>>>>>>> On 4 May 2020, at 10:52, Ilya Khlopotov <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> Thank you Robert for starting this important discussion. I think
>>>>>>>>>>> that the values you propose make sense.
>>>>>>>>>>> I can see a case when user would use hashes as document ids. All
>>>>>>>>>>> existent hash functions I am aware of should return data which fit
>>>>>>>>>>> into 512 characters. There is only one case which doesn't fit into
>>>>>>>>>>> 512 limit. If user would decide to use RSA signatures as document
>>>>>>>>>>> ids and they use 4096-bit keys the signature size would be
>>>>>>>>>>> 684 bytes.
>>>>>>>>>>>
>>>>>>>>>>> However in this case users can easily replace signatures with
>>>>>>>>>>> hashes of signatures. So I wouldn't worry about it too much. 512
>>>>>>>>>>> sounds plenty to me.
>>>>>>>>>>>
>>>>>>>>>>> +1 to set hard limits on db name size and doc id size with proposed
>>>>>>>>>>> values.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> iilyak
>>>>>>>>>>>
>>>>>>>>>>> On 2020/05/01 18:36:45, Robert Samuel Newson <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>> Hello,
>>>>>>>>>>>>
>>>>>>>>>>>> There are other threads related to doc size (etc) limits for
>>>>>>>>>>>> CouchDB 4.0, motivated by restrictions in FoundationDB, but we
>>>>>>>>>>>> haven't discussed database name length and doc id length limits.
>>>>>>>>>>>> These are encoded into FoundationDB keys and so we would be wise
>>>>>>>>>>>> to forcibly limit their length from the start.
>>>>>>>>>>>>
>>>>>>>>>>>> I propose 256 character limit for database name and 512 character
>>>>>>>>>>>> limit for doc ids.
>>>>>>>>>>>>
>>>>>>>>>>>> If you can't uniquely identify your database or document within
>>>>>>>>>>>> those limits I argue that you're doing something wrong, and the
>>>>>>>>>>>> limits here, while making FDB happy, are an aid to sensible
>>>>>>>>>>>> application design.
>>>>>>>>>>>>
>>>>>>>>>>>> Does anyone want higher or lower limits? Comments pls.
>>>>>>>>>>>>
>>>>>>>>>>>> B.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>>>
>>
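For anyone following the 238 vs 256 arithmetic in this thread, here is a minimal Python sketch of where 238 comes from. It is not CouchDB code. It assumes the 255-character filename limit Nick mentions, and it assumes that an internal shard file is named `<dbname>.<10-digit timestamp>.couch`; that suffix layout and every name in the snippet are illustrative assumptions.

```python
# Sketch only: the suffix layout below is an assumption about pre-4.0
# shard file names, and all identifiers here are hypothetical.
FS_NAME_MAX = 255                      # common filesystem filename limit
EXAMPLE_SUFFIX = ".1588612345.couch"   # "." + 10-digit timestamp + ".couch"
PROPOSED_LIMIT = 256                   # db name limit proposed for 4.0


def max_legacy_db_name_len(fs_name_max: int = FS_NAME_MAX,
                           suffix: str = EXAMPLE_SUFFIX) -> int:
    """Longest db name whose shard file name still fits in fs_name_max."""
    return fs_name_max - len(suffix)


def fits_on_legacy(db_name: str) -> bool:
    """True if a pre-4.0 node could create shard files for this name."""
    return len(db_name) <= max_legacy_db_name_len()


print(max_legacy_db_name_len())                   # 238
print(PROPOSED_LIMIT - max_legacy_db_name_len())  # 18, the gap Nick mentions
print(fits_on_legacy("a" * 238))                  # True
print(fits_on_legacy("a" * 250))                  # False
```

Under these assumptions, only names of 239 to 256 characters created on 4.0 would fail to replicate back under the same name, which is the trade-off being debated above.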

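The doc id case Ilya raises can be checked with a short Python sketch. It assumes the signature is base64-encoded before being used as a doc id (the thread does not say which encoding yields 684) and picks SHA-256 as one possible hash. The zero-filled buffer is only a stand-in with the right length, not a real signature, and all names are hypothetical.

```python
import base64
import hashlib

DOC_ID_LIMIT = 512    # doc id limit proposed in the thread
RSA_KEY_BITS = 4096   # key size from Ilya's example


def signature_as_doc_id(signature: bytes) -> str:
    """Base64-encode a raw signature (assumed encoding) for use as a doc id."""
    return base64.b64encode(signature).decode("ascii")


def hashed_doc_id(signature: bytes) -> str:
    """Use a SHA-256 hex digest of the signature as the doc id instead."""
    return hashlib.sha256(signature).hexdigest()


# An RSA signature is as long as the key modulus: 4096 bits = 512 bytes.
placeholder_signature = bytes(RSA_KEY_BITS // 8)

raw_id = signature_as_doc_id(placeholder_signature)
short_id = hashed_doc_id(placeholder_signature)

print(len(raw_id), len(raw_id) <= DOC_ID_LIMIT)      # 684 False
print(len(short_id), len(short_id) <= DOC_ID_LIMIT)  # 64 True
```

This matches the suggestion in the thread: the encoded signature exceeds 512 characters, while a hash of it fits comfortably.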