[ 
https://issues.apache.org/jira/browse/CASSANDRA-14575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16550730#comment-16550730
 ] 

Jason Brown edited comment on CASSANDRA-14575 at 7/20/18 12:44 PM:
-------------------------------------------------------------------

[~iamaleksey] mentioned this in CASSANDRA-14574
{quote}As for when it's safe to ignore a failed to deser message - at least in 
the case of unknown table id it is, and that's a common enough scenario. Think 
someone creates a table and starts writes before waiting for schema to 
propagate. Or batchlog replays a mutation to a node on which a table is either 
not yet known, or has been dropped since. Or, occasionally, when we add new 
tables and use them during mixed mode/upgrade period.
{quote}
Yup, at a minimum I think we can handle this one here, as we already know the 
payload length, and it's easy to skip beyond this message. Possibly all payload 
deser failures are ok are to ignore, as long as we can skip the payload bytes.

If, however, we fail in parsing any other part of the message (magic, params, 
and so on), we're pretty much screwed and will need to shut down the connection.

ftr, in CASSANDRA-14447, if we receive a verb id we do not recognize, we'll 
still deserialize the message ([payload is 
skipped|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessageIn.java#L150],
 as we have no serializer for the verb id we don't know), and the message [will 
be 
dropped|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessageDeliveryTask.java#L66]
 in {{MessageDeliveryTask.process()}}. Thus, we don't need to worry about 
unknown verbs for this ticket.

Reagrding params out of the message, with CASSANDRA-7544, we introduced an enum 
for all parameter names, {{ParameterType}}. In 
{{MessageInHandler.readParams()}}, we [look up the 
{{ParameterType}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/async/MessageInHandler.java#L160]
 based on the string we read from the stream. If the string is unknown (it came 
from a future version), the lookup returns null. -Several lines later we'll try 
to insert that null key into the {{EnumMap}}, and we'll get an exception and 
end up closing the connection. It's simple enough to still deserialize the key 
and value, and just not insert into the map if the key is null (effectively 
ignoring the param). This should be handled by this ticket, as well.-


 -CORRECTION: we need to know the {{parameterType}} to deserialize the value. 
Thus we have to fail the connection as we won't know the length of the value to 
read from the buf-.

Ughh, I need more coffee. CORRECTION 2: We *do* know the length of the value to 
read, we just won't know how to deserialize it. Thus, we can just the deser and 
inserting into the map step, and retain the connection as we'll still be in a 
deterministic state of the byte stream.


was (Author: jasobrown):
[~iamaleksey] mentioned this in CASSANDRA-14574
{quote}As for when it's safe to ignore a failed to deser message - at least in 
the case of unknown table id it is, and that's a common enough scenario. Think 
someone creates a table and starts writes before waiting for schema to 
propagate. Or batchlog replays a mutation to a node on which a table is either 
not yet known, or has been dropped since. Or, occasionally, when we add new 
tables and use them during mixed mode/upgrade period.
{quote}
Yup, at a minimum I think we can handle this one here, as we already know the 
payload length, and it's easy to skip beyond this message. Possibly all payload 
deser failures are ok are to ignore, as long as we can skip the payload bytes.

If, however, we fail in parsing any other part of the message (magic, params, 
and so on), we're pretty much screwed and will need to shut down the connection.

ftr, in CASSANDRA-14447, if we receive a verb id we do not recognize, we'll 
still deserialize the message ([payload is 
skipped|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessageIn.java#L150],
 as we have no serializer for the verb id we don't know), and the message [will 
be 
dropped|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessageDeliveryTask.java#L66]
 in {{MessageDeliveryTask.process()}}. Thus, we don't need to worry about 
unknown verbs for this ticket.

Reagrding params out of the message, with CASSANDRA-7544, we introduced an enum 
for all parameter names, {{ParameterType}}. In 
{{MessageInHandler.readParams()}}, we [look up the 
{{ParameterType}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/async/MessageInHandler.java#L160]
 based on the string we read from the stream. If the string is unknown (it came 
from a future version), the lookup returns null. -Several lines later we'll try 
to insert that null key into the {{EnumMap}}, and we'll get an exception and 
end up closing the connection. It's simple enough to still deserialize the key 
and value, and just not insert into the map if the key is null (effectively 
ignoring the param). This should be handled by this ticket, as well.- 
 -CORRECTION: we need to know the {{parameterType}} to deserialize the value. 
Thus we have to fail the connection as we won't know the length of the value to 
read from the buf.\-

Ughh, I need more coffee. CORRECTION 2: We *do* know the length of the value to 
read, we just won't know how to deserialize it. Thus, we can just the deser and 
inserting into the map step, and retain the connection as we'll still be in a 
deterministic state of the byte stream.

> Reevaluate when to drop an internode connection on message error
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-14575
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14575
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>             Fix For: 4.0
>
>
> As mentioned in CASSANDRA-14574, explore if and when we can safely ignore an 
> incoming internode message on certain classes of failure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org

Reply via email to