[jira] [Commented] (PROTON-834) proton-j: UTF-8 encoder reporting some three byte characters as invalid surrogates
[ https://issues.apache.org/jira/browse/PROTON-834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14512662#comment-14512662 ]

ASF subversion and git services commented on PROTON-834:

Commit 810088b14dedcd12a9474687ba9cd05fc8297188 in qpid-proton's branch refs/heads/0.9.x from [~dnwe]
[ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=810088b ]

PROTON-834: further UTF-8 encoder fixes

After commit c65e897 it turned out there were still some issues with
strings containing a codepoint >0xDBFF which was being incorrectly
treated as a surrogate pair in the calculateUTF8Length method. Fixed
this up and added some more test coverage.

Closes #13

(cherry picked from commit 7b9b516d445ab9e86a0313709c77218d901435b1)

> proton-j: UTF-8 encoder reporting some three byte characters as invalid surrogates
> --
>
> Key: PROTON-834
> URL: https://issues.apache.org/jira/browse/PROTON-834
> Project: Qpid Proton
> Issue Type: Bug
> Components: proton-j
> Affects Versions: 0.8
> Reporter: Dominic Evans
> Assignee: Dominic Evans
>
> Following on from the fixes made under PROTON-576, some UTF-8 characters were
> getting incorrectly reported as invalid surrogates, when they were valid
> 3-byte encodings.
> e.g.,
> !!!
> (╯°□°)╯︵ ┻━┻
> etc.
> This is an issue when streaming variable content such as Twitter messages
> which can often contain such characters.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PROTON-834) proton-j: UTF-8 encoder reporting some three byte characters as invalid surrogates
[ https://issues.apache.org/jira/browse/PROTON-834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363413#comment-14363413 ]

ASF subversion and git services commented on PROTON-834:

Commit 810088b14dedcd12a9474687ba9cd05fc8297188 in qpid-proton's branch refs/heads/0.9 from [~dnwe]
[ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=810088b ]

PROTON-834: further UTF-8 encoder fixes

After commit c65e897 it turned out there were still some issues with
strings containing a codepoint >0xDBFF which was being incorrectly
treated as a surrogate pair in the calculateUTF8Length method. Fixed
this up and added some more test coverage.

Closes #13

(cherry picked from commit 7b9b516d445ab9e86a0313709c77218d901435b1)
[jira] [Commented] (PROTON-834) proton-j: UTF-8 encoder reporting some three byte characters as invalid surrogates
[ https://issues.apache.org/jira/browse/PROTON-834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363394#comment-14363394 ]

ASF GitHub Bot commented on PROTON-834:

Github user asfgit closed the pull request at:

https://github.com/apache/qpid-proton/pull/13
[jira] [Commented] (PROTON-834) proton-j: UTF-8 encoder reporting some three byte characters as invalid surrogates
[ https://issues.apache.org/jira/browse/PROTON-834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363393#comment-14363393 ]

ASF subversion and git services commented on PROTON-834:

Commit 7b9b516d445ab9e86a0313709c77218d901435b1 in qpid-proton's branch refs/heads/master from [~dnwe]
[ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=7b9b516 ]

PROTON-834: further UTF-8 encoder fixes

After commit c65e897 it turned out there were still some issues with
strings containing a codepoint >0xDBFF which was being incorrectly
treated as a surrogate pair in the calculateUTF8Length method. Fixed
this up and added some more test coverage.

Closes #13
[jira] [Commented] (PROTON-834) proton-j: UTF-8 encoder reporting some three byte characters as invalid surrogates
[ https://issues.apache.org/jira/browse/PROTON-834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363365#comment-14363365 ]

ASF GitHub Bot commented on PROTON-834:

Github user gemmellr commented on the pull request:

https://github.com/apache/qpid-proton/pull/13#issuecomment-81751109

Looks good to me. We should request for inclusion in 0.9 if there is an RC3 to pick up aconways SSL change.
[jira] [Commented] (PROTON-834) proton-j: UTF-8 encoder reporting some three byte characters as invalid surrogates
[ https://issues.apache.org/jira/browse/PROTON-834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14363325#comment-14363325 ]

ASF GitHub Bot commented on PROTON-834:

GitHub user dnwe opened a pull request:

https://github.com/apache/qpid-proton/pull/13

PROTON-834: further UTF-8 encoder fixes

After commit c65e897 it turned out there were still some issues with
strings containing a codepoint >0xDBFF which was being incorrectly
treated as a surrogate pair in the calculateUTF8Length method. Fixed
this up and added some more test coverage.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dnwe/qpid-proton fix-proton-834

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/qpid-proton/pull/13.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #13

commit dc52650e7de53ef5fe294b9066620b4698c30a94
Author: Dominic Evans
Date: 2015-03-16T12:18:20Z

PROTON-834: further UTF-8 encoder fixes

After commit c65e897 it turned out there were still some issues with
strings containing a codepoint >0xDBFF which was being incorrectly
treated as a surrogate pair in the calculateUTF8Length method. Fixed
this up and added some more test coverage.
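The length-calculation problem described in the pull request above can be illustrated with a minimal sketch. This is not the proton-j code: the class `Utf8LengthSketch` and the method `utf8Length` are hypothetical names chosen for illustration, and the only APIs assumed are the standard `java.lang.Character` helpers. The point it tries to show is that only chars in 0xD800..0xDFFF are surrogates; a char above 0xDFFF (for example U+FE35, which appears in the example string from the issue) is an ordinary three-byte character.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; hypothetical names, not the proton-j implementation.
final class Utf8LengthSketch {

    // Returns the number of bytes the UTF-8 encoding of s would occupy.
    static int utf8Length(CharSequence s) {
        int len = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                len += 1;                       // ASCII
            } else if (c < 0x800) {
                len += 2;                       // two-byte range
            } else if (Character.isHighSurrogate(c)) {
                // 0xD800..0xDBFF: must be followed by a low surrogate
                if (i + 1 < s.length() && Character.isLowSurrogate(s.charAt(i + 1))) {
                    len += 4;                   // one supplementary code point
                    i++;                        // skip the low surrogate
                } else {
                    throw new IllegalArgumentException("unpaired high surrogate at index " + i);
                }
            } else if (Character.isLowSurrogate(c)) {
                // 0xDC00..0xDFFF without a preceding high surrogate
                throw new IllegalArgumentException("unpaired low surrogate at index " + i);
            } else {
                len += 3;                       // rest of the BMP, including 0xE000..0xFFFF
            }
        }
        return len;
    }

    public static void main(String[] args) {
        // The table-flip example from the issue, written with escapes.
        String flip = "(\u256F\u00B0\u25A1\u00B0)\u256F\uFE35 \u253B\u2501\u253B";
        System.out.println(utf8Length(flip));
        System.out.println(flip.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

Comparing the result against `String.getBytes(StandardCharsets.UTF_8).length` is a convenient cross-check for strings that contain no unpaired surrogates.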
[jira] [Commented] (PROTON-834) proton-j: UTF-8 encoder reporting some three byte characters as invalid surrogates
[ https://issues.apache.org/jira/browse/PROTON-834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14348396#comment-14348396 ]

ASF subversion and git services commented on PROTON-834:

Commit c65e89730f67cd3a8aa31c0d0de491b20810c99f in qpid-proton's branch refs/heads/master from [~dnwe]
[ https://git-wip-us.apache.org/repos/asf?p=qpid-proton.git;h=c65e897 ]

PROTON-834: modified UTF-8 encoder fixes

Commit 5069bb6 applied a modified version of a patch I submitted, to
ensure that the UTF-8 encoder (and UTF-8 byte length calculator) would
cope with surrogate pairs. This commit fixes an issue with three byte
characters in the <= 0x range being incorrectly detected as invalid
four byte surrogates.

Closes #10

> proton-j: UTF-8 encoder reporting some three byte characters as invalid surrogates
> --
>
> Key: PROTON-834
> URL: https://issues.apache.org/jira/browse/PROTON-834
> Project: Qpid Proton
> Issue Type: Bug
> Components: proton-j
> Affects Versions: 0.8
> Reporter: Dominic Evans
> Fix For: 0.9
>
> Following on from the fixes made under PROTON-576, some UTF-8 characters were
> getting incorrectly reported as invalid surrogates, when they were valid
> 3-byte encodings.
> e.g.,
> !!!
> (╯°□°)╯︵ ┻━┻
> etc.
> This is an issue when streaming variable content such as Twitter messages
> which can often contain such characters.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
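For readers unfamiliar with the encoder side of the commit message above, here is a rough sketch of a UTF-8 encoder that separates three-byte BMP characters from surrogate pairs. It is an illustration under assumptions, not the proton-j encoder: `Utf8EncodeSketch` and `encode` are hypothetical names, and the output goes to a `ByteArrayOutputStream` purely for simplicity. The key design choice is testing `Character.isSurrogate` before treating a char as part of a four-byte sequence, so that every other char at or above 0x800 takes the three-byte path.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch only; hypothetical names, not the proton-j encoder.
final class Utf8EncodeSketch {

    static byte[] encode(CharSequence s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                out.write(c);                                  // 1 byte
            } else if (c < 0x800) {
                out.write(0xC0 | (c >> 6));                    // 2 bytes
                out.write(0x80 | (c & 0x3F));
            } else if (!Character.isSurrogate(c)) {
                out.write(0xE0 | (c >> 12));                   // 3 bytes, non-surrogate BMP
                out.write(0x80 | ((c >> 6) & 0x3F));
                out.write(0x80 | (c & 0x3F));
            } else if (Character.isHighSurrogate(c)
                    && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                // Valid pair: combine into one code point and emit 4 bytes.
                int cp = Character.toCodePoint(c, s.charAt(++i));
                out.write(0xF0 | (cp >> 18));
                out.write(0x80 | ((cp >> 12) & 0x3F));
                out.write(0x80 | ((cp >> 6) & 0x3F));
                out.write(0x80 | (cp & 0x3F));
            } else {
                throw new IllegalArgumentException("invalid surrogate at index " + i);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // The table-flip example from the issue, written with escapes.
        String flip = "(\u256F\u00B0\u25A1\u00B0)\u256F\uFE35 \u253B\u2501\u253B";
        byte[] expected = flip.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(encode(flip), expected));
    }
}
```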
[jira] [Commented] (PROTON-834) proton-j: UTF-8 encoder reporting some three byte characters as invalid surrogates
[ https://issues.apache.org/jira/browse/PROTON-834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14348397#comment-14348397 ]

ASF GitHub Bot commented on PROTON-834:

Github user asfgit closed the pull request at:

https://github.com/apache/qpid-proton/pull/10