[ 
https://issues.apache.org/jira/browse/CASSANDRA-20425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nadav Har'El updated CASSANDRA-20425:
-------------------------------------
    Description: 
Cassandra's documentation 
[https://cassandra.apache.org/doc/stable/cassandra/cql/ddl.html] claims that:

Both keyspace and table name ... are limited in size to 48 characters (that 
limit exists mostly to avoid filenames (which may include the keyspace and 
table name) to go over the limits of certain file systems). 

I checked, and although this limit was enforced in Cassandra 3, it has not been 
enforced since Cassandra 4. It seems this change was not intentional, and 
happened eight years ago in [this 
commit|https://github.com/apache/cassandra/commit/af3fe39dcabd9ef77a00309ce6741268423206df].
 Before this commit, we had

    public static boolean isNameValid(String name)
    {
        return name != null && !name.isEmpty()
               && name.length() <= SchemaConstants.NAME_LENGTH
               && PATTERN_WORD_CHARS.matcher(name).matches();
    }

After it, this definition was dropped and IndexMetadata.isNameValid() is used 
instead, which never had a name length check.
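
For illustration, here is a minimal stand-alone sketch of what a validator with the length check restored could look like. The class name NameLengthCheck is hypothetical, the constant is assumed to mirror SchemaConstants.NAME_LENGTH (48, per the report), and the pattern is assumed to match one or more word characters. It is not Cassandra's actual code.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone class for illustration; not part of Cassandra.
final class NameLengthCheck
{
    // Assumed to mirror SchemaConstants.NAME_LENGTH, which the report says is 48.
    static final int NAME_LENGTH = 48;

    // Assumed stand-in for PATTERN_WORD_CHARS: one or more word characters.
    static final Pattern WORD_CHARS = Pattern.compile("\\w+");

    // Same shape as the pre-commit check quoted above: non-null, non-empty,
    // within the length limit, and made only of word characters.
    static boolean isNameValid(String name)
    {
        return name != null
            && !name.isEmpty()
            && name.length() <= NAME_LENGTH
            && WORD_CHARS.matcher(name).matches();
    }
}
```

With this sketch, a 48-character name passes and a 49-character name is rejected, which is the behavior the documentation describes.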

We can see that dropping the limit check was unintentional in several ways. 
First, the documentation still mentions this no-longer-existing limitation. 
Second, the code still defines SchemaConstants.NAME_LENGTH, set to 48, but no 
longer uses it for validation. Or rather, it is used, but only in error 
messages, which no longer match what is actually being tested!
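
One way to keep the error message and the test from drifting apart is to derive both from the same constant in a single place. The sketch below is only an illustration of that idea: the class name, method name, and message wording are all hypothetical, and the value 48 is assumed from the report.

```java
// Hypothetical stand-alone class for illustration; not part of Cassandra.
final class NameValidationMessage
{
    // Assumed to mirror SchemaConstants.NAME_LENGTH (48, per the report).
    static final int NAME_LENGTH = 48;

    // Returns null when the name passes, otherwise a message built from the
    // same constant that the check itself uses.
    static String validationError(String name)
    {
        if (name == null || name.isEmpty())
            return "Name must not be empty";
        if (name.length() > NAME_LENGTH)
            return String.format("Name must not be longer than %d characters (got %d)",
                                 NAME_LENGTH, name.length());
        return null;
    }
}
```

Because the limit appears in the message only through NAME_LENGTH, removing or changing the check forces the message to be revisited too.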

I think Cassandra should either restore this name length limit that was 
accidentally dropped 8 years ago, or we could decide officially that this name 
length limit is lifted, and drop it from the documentation along with all 
mentions of NAME_LENGTH in error messages (and remove this constant entirely).

But before deciding what to do, I want to make another point. The documentation 
rightly explained the original raison d'etre for this 48-character limitation: 
"that limit exists mostly to avoid filenames to go over the limits of certain 
file systems". This reason is still relevant. I checked what happens when I try 
to create a table with a 300-character name on a standard Linux filesystem. On 
Cassandra 4, the behavior was reasonable - I got a bizarre error, but 
afterwards the database was still functional. On Cassandra 5, the result was a 
disaster - Cassandra hung in some strange loop of IO errors and never 
recovered. So if the decision is to allow table names longer than 48 
characters, as has been allowed for the last 8 years, we should consider 
whether to set a higher limit instead, or perhaps try to catch the error of 
filenames too long for this filesystem and fail more gracefully.
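
As a rough sketch of the "fail more gracefully" option, a name could be checked against a filename limit before any file is created. The class and method names below are hypothetical. The 255-byte value is the per-component limit on common Linux filesystems such as ext4 and XFS, and is hard-coded here as an assumption rather than queried from the actual filesystem. The suffix parameter stands in for whatever Cassandra appends to the table name when building file or directory names, which this sketch does not try to reproduce.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone class for illustration; not part of Cassandra.
final class FileNameLimitCheck
{
    // Assumption: 255 bytes per path component, as on ext4 and XFS.
    static final int MAX_COMPONENT_BYTES = 255;

    // True if the component's UTF-8 encoding fits within the assumed limit.
    static boolean fitsInFileName(String component)
    {
        return component.getBytes(StandardCharsets.UTF_8).length <= MAX_COMPONENT_BYTES;
    }

    // Rejects the name up front with a descriptive error, instead of letting
    // a later file operation fail with an IO error.
    static void requireFits(String tableName, String suffix)
    {
        String component = tableName + suffix;
        if (!fitsInFileName(component))
            throw new IllegalArgumentException(String.format(
                "Table name of %d characters would produce a filename longer than %d bytes",
                tableName.length(), MAX_COMPONENT_BYTES));
    }
}
```

With these assumptions, the 300-character name from the experiment above would be rejected at schema-change time with a clear message.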

  was:
Cassandra's documentation 
[https://cassandra.apache.org/doc/stable/cassandra/cql/ddl.html] claims that:

Both keyspace and table name ... are limited in size to 48 characters (that 
limit exists mostly to avoid filenames (which may include the keyspace and 
table name) to go over the limits of certain file systems). 

I checked, and although this limit was enforced in Cassandra 3, it has not been 
enforced since Cassandra 4. It seems this change was not intentional, and 
happened eight years ago in commit 
[https://github.com/apache/cassandra/commit/af3fe39dcabd9ef77a00309ce6741268423206df|https://github.com/apache/cassandra/commit/af3fe39dcabd9ef77a00309ce6741268423206df].
 Before this commit, we had

    public static boolean isNameValid(String name)
    {
        return name != null && !name.isEmpty()
               && name.length() <= SchemaConstants.NAME_LENGTH
               && PATTERN_WORD_CHARS.matcher(name).matches();
    }

After it, this definition was dropped and IndexMetadata.isNameValid() is used 
instead, which never had a name length check.

We can see that dropping the limit check was unintentional in several ways. 
First, the documentation still mentions this no-longer-existing limitation. 
Second, the code still defines SchemaConstants.NAME_LENGTH, set to 48, but no 
longer uses it for validation. Or rather, it is used, but only in error 
messages, which no longer match what is actually being tested!

I think Cassandra should either restore this name length limit that was 
accidentally dropped 8 years ago, or we could decide officially that this name 
length limit is lifted, and drop it from the documentation along with all 
mentions of NAME_LENGTH in error messages (and remove this constant entirely).

But before deciding what to do, I want to make another point. The documentation 
rightly explained the original raison d'etre for this 48-character limitation: 
"that limit exists mostly to avoid filenames to go over the limits of certain 
file systems". This reason is still relevant. I checked what happens when I try 
to create a table with a 300-character name on a standard Linux filesystem. On 
Cassandra 4, the behavior was reasonable - I got a bizarre error, but 
afterwards the database was still functional. On Cassandra 5, the result was a 
disaster - Cassandra hung in some strange loop of IO errors and never 
recovered. So if the decision is to allow table names longer than 48 
characters, as has been allowed for the last 8 years, we should consider 
whether to set a higher limit instead, or perhaps try to catch the error of 
filenames too long for this filesystem and fail more gracefully.


> Table name length validation accidentally lost since Cassandra 4
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-20425
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-20425
>             Project: Apache Cassandra
>          Issue Type: Bug
>            Reporter: Nadav Har'El
>            Priority: Normal


