Simon Lundstrom created AMQ-8398:
------------------------------------

             Summary: 4-byte Unicode message from JMS to STOMP will be corrupted
                 Key: AMQ-8398
                 URL: https://issues.apache.org/jira/browse/AMQ-8398
             Project: ActiveMQ
          Issue Type: Bug
          Components: Broker, STOMP, Transport
    Affects Versions: 5.16.3
            Reporter: Simon Lundstrom

When a message contains a code point that takes 4 bytes in UTF-8 (e.g. U+1F5A4, https://unicode-table.com/en/1F5A4/), it is corrupted when sent from:
* a JMS producer to a STOMP consumer, or
* a STOMP producer to a JMS consumer.

In the JMS to STOMP case the code point is converted to {{ef bf bd ef bf bd}} (the UTF-8 encoding of two U+FFFD replacement characters) when it should be {{f0 9f 96 a4}}.

In the STOMP to JMS case the JMS client throws an exception:
{code}
Exception in thread "main" javax.jms.JMSException: java.io.UTFDataFormatException
	at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:72)
	at org.apache.activemq.command.ActiveMQTextMessage.decodeContent(ActiveMQTextMessage.java:104)
	at org.apache.activemq.command.ActiveMQTextMessage.getText(ActiveMQTextMessage.java:84)
	at testkonsument.App.JMS(App.java:86)
	at testkonsument.App.main(App.java:42)
Caused by: java.io.UTFDataFormatException
	at org.apache.activemq.util.MarshallingSupport.convertUTF8WithBuf(MarshallingSupport.java:389)
	at org.apache.activemq.util.MarshallingSupport.readUTF8(MarshallingSupport.java:358)
	at org.apache.activemq.command.ActiveMQTextMessage.decodeContent(ActiveMQTextMessage.java:101)
	... 3 more
{code}

Sending 4-byte code points from STOMP to STOMP or from JMS to JMS is not a problem: both work and the code point is not corrupted.

Note that 2- or 3-byte Unicode code points do NOT get corrupted, even if the same message also includes a 4-byte code point.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
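
For anyone comparing hex dumps against the byte sequences quoted in the description, the following is a minimal, broker-independent sketch. It prints the standard UTF-8 bytes of U+1F5A4, the bytes that Java's "modified UTF-8" ({{DataOutputStream.writeUTF}}) produces for the same string, and the bytes of two U+FFFD replacement characters. The class name {{Utf8Probe}} and its helper methods are hypothetical and exist only for this illustration. Whether a mix-up between the two encodings is what actually happens inside the broker is an assumption this report does not establish; the sketch only shows that the two encodings differ for 4-byte code points.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not part of ActiveMQ.
public class Utf8Probe {
    // U+1F5A4 as a Java string: a surrogate pair (two UTF-16 chars).
    static final String BLACK_HEART = new String(Character.toChars(0x1F5A4));

    // Formats bytes as lowercase hex separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Standard UTF-8: a supplementary code point becomes one 4-byte sequence.
    static byte[] standardUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Java "modified UTF-8" as written by DataOutputStream.writeUTF:
    // each surrogate is encoded separately as a 3-byte sequence.
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] all = buf.toByteArray();
        // writeUTF prefixes the payload with a 2-byte length; drop it.
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) {
        System.out.println("standard UTF-8: " + hex(standardUtf8(BLACK_HEART)));
        System.out.println("modified UTF-8: " + hex(modifiedUtf8(BLACK_HEART)));
        System.out.println("2 x U+FFFD:     " + hex(standardUtf8("\uFFFD\uFFFD")));
    }
}
```

The first line of output is the sequence a STOMP consumer should receive ({{f0 9f 96 a4}}), and the last is the sequence actually observed in the JMS to STOMP case ({{ef bf bd ef bf bd}}).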