[ 
https://issues.apache.org/jira/browse/CASSANDRA-10374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037659#comment-15037659
 ] 

Sylvain Lebresne commented on CASSANDRA-10374:
----------------------------------------------

bq. Maybe we can add a new configuration option

We avoid adding new configuration unless the justification for it is strong, as 
multiplying configuration options ends up more confusing than helpful (each one 
also needs to be documented and whatnot). This case absolutely does not meet our 
standard for justifying a configuration option.

The thing is, using big values for collections is a bad idea in the first place, 
as collections are always read entirely. Surely there are a few cases that are 
reasonable yet run into that limit, and that's why we're looking at removing 
it, but this ticket is, I would argue, more of an improvement, and a minor 
one, which in itself justifies limiting it to 3.0.

Further, we're actually not lifting the limit entirely in 2.1/2.2: because we 
still have the internal limitation on the cell name, the only cases where we 
effectively allow > 64k values are lists and map values (not map keys, in 
particular). And for lists, I would argue that using large-ish values is an even 
worse idea than for collections in general, since they sometimes trigger a 
read-before-write. And I certainly wouldn't want this to become a justification 
for going with lists over sets, as that's a bad idea in the long term.
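
To make the asymmetry concrete, here is a rough sketch of which collection 
parts would still be capped in 2.1/2.2 with the patch applied. This is not the 
actual Cassandra code: the class, enum and method names are hypothetical, and 
it simplifies by comparing the element alone against the 64k limit, whereas 
the real limit applies to the whole cell name.

```java
import java.nio.ByteBuffer;

// Illustrative only: hypothetical names, not taken from the Cassandra code base.
public final class CollectionSizeLimits {
    // Largest length that fits in an unsigned 16-bit field (65535 bytes).
    public static final int MAX_UNSIGNED_SHORT = 0xFFFF;

    public enum Part { LIST_VALUE, MAP_VALUE, MAP_KEY, SET_ELEMENT }

    // Map keys and set elements live in the cell name in 2.1/2.2,
    // so they stay bound by the cell name size limit.
    public static boolean storedInCellName(Part part) {
        return part == Part.MAP_KEY || part == Part.SET_ELEMENT;
    }

    // Simplification: checks the element size alone against the limit.
    public static boolean isAllowed(Part part, ByteBuffer value) {
        if (storedInCellName(part)) {
            return value.remaining() <= MAX_UNSIGNED_SHORT;
        }
        return true; // list values and map values are no longer capped at 64k
    }

    public static void main(String[] args) {
        ByteBuffer big = ByteBuffer.allocate(200 * 1024); // a 200KB value
        for (Part part : Part.values()) {
            String verdict = isAllowed(part, big) ? "accepted" : "rejected";
            System.out.println(part + " -> " + verdict);
        }
    }
}
```

With a 200KB value, only the list value and map value cases pass, which is the 
inconsistency between collection types I'm referring to.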

So to sum up, in 2.1/2.2, committing this:
* is a problem for earlier protocol versions
* creates inconsistencies in collections (not all collections are limited 
similarly)
* is arguably only vaguely reasonable for map values.

I do strongly think 3.0-only is the most reasonable thing to do.

bq. we have implemented a table that is needing about 200KB per value with this 
bug in mind and assuming (probably bad idea) that this bug would have been 
fixed soon in 2.1/2.2

As hinted above, I'd argue that requiring 200KB values in collections is 
probably the first bad idea, but relying on the tentative "Fix versions" of an 
unresolved ticket is certainly something to avoid. That said, as my previous 
comment mentioned, if you're in that unfortunate situation, you still have the 
option of applying the patch locally.

> List and Map values incorrectly limited to 64k size
> ---------------------------------------------------
>
>                 Key: CASSANDRA-10374
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10374
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Tyler Hobbs
>            Assignee: Benjamin Lerer
>            Priority: Minor
>             Fix For: 3.0.1, 3.1, 2.1.x, 2.2.x
>
>
> With the v3 native protocol, we switched from encoding collection element 
> sizes with shorts to ints.  However, in {{Lists.java}} and {{Maps.java}}, we 
> still validate that list and map values are smaller than 
> {{MAX_UNSIGNED_SHORT}}.
> Map keys and set elements are stored in the cell name, so they're implicitly 
> limited to the cell name size limit of 64k.  However, for non-frozen 
> collections, this limitation should not apply, so we probably don't want to 
> perform this check here for those either.
> The fix should include tests where we exceed the 64k limit for frozen and 
> non-frozen collections.  In the case of non-frozen lists and maps, we should 
> verify that the 64k cell-name size limit is enforced in a friendly way.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
