On 06/01/15 09:31, Jordan Justen wrote:
> Supplementary Plane characters can exist in UTF-16 files,
> but they are not valid UCS-2 characters.
> 
> For example, this python interpreter code:
> >>> import codecs
> >>> codecs.encode(u'\U00010300', 'utf-16')
> '\xff\xfe\x00\xd8\x00\xdf'
> 
> Therefore the UCS-4 0x00010300 character is encoded as two
> 16-bit numbers (0xd800 0xdf00) in a little endian UTF-16
> file.
> 
> For more information, see:
> http://en.wikipedia.org/wiki/UTF-16#U.2B10000_to_U.2B10FFFF
> 
> This test checks to make sure that BaseTools will reject these
> characters in UTF-16 files.
> 
> This test was fixed by the previous commit:
> "BaseTools/UniClassObject: Verify valid UCS-2 chars in UTF-16 .uni files"
> 
> Contributed-under: TianoCore Contribution Agreement 1.0
> Signed-off-by: Jordan Justen <jordan.l.jus...@intel.com>
> Cc: Yingke D Liu <yingke.d....@intel.com>
> Cc: Michael D Kinney <michael.d.kin...@intel.com>
> Cc: Laszlo Ersek <ler...@redhat.com>
> ---
>  BaseTools/Tests/CheckUnicodeSourceFiles.py | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
> 
> diff --git a/BaseTools/Tests/CheckUnicodeSourceFiles.py b/BaseTools/Tests/CheckUnicodeSourceFiles.py
> index 0083ad8..39fd2fe 100644
> --- a/BaseTools/Tests/CheckUnicodeSourceFiles.py
> +++ b/BaseTools/Tests/CheckUnicodeSourceFiles.py
> @@ -81,6 +81,21 @@ class Tests(TestTools.BaseToolsTest):
>      def testUtf16InUniFile(self):
>          self.CheckFile('utf_16', shouldPass=True)
>  
> +    def testSupplementaryPlaneUnicodeCharInUtf16File(self):
> +        #
> +        # Supplementary Plane characters can exist in UTF-16 files,
> +        # but they are not valid UCS-2 characters.
> +        #
> +        # This test makes sure that BaseTools rejects these characters
> +        # if seen in a .uni file.
> +        #
> +        data = u'''
> +            #langdef en-US "English"
> +            #string STR_A #language en-US "CodePoint (\U00010300) > 0xFFFF"
> +        '''
> +
> +        self.CheckFile('utf_16', shouldPass=False, string=data)
> +
>  TheTestSuite = TestTools.MakeTheTestSuite(locals())
>  
>  if __name__ == '__main__':
> 

I'd propose extending this with a test case that feeds binary data (not
unicode text) to the checker; the data should look similar to the
printf example in my previous comment.
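
As a rough illustration of what such binary input could look like, here is
a minimal sketch that builds the raw little-endian UTF-16 bytes for a
Supplementary Plane code point by hand, without going through a unicode
string. The helper names (utf16le_surrogate_bytes, contains_surrogate_units)
are hypothetical and are not part of BaseTools or TestTools; the surrogate
scan is only a stand-in for whatever check the tool really performs.

```python
import codecs
import struct


def utf16le_surrogate_bytes(code_point):
    """Return BOM + UTF-16LE surrogate pair for a code point above 0xFFFF.

    Hypothetical helper, for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a Supplementary Plane code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
    return codecs.BOM_UTF16_LE + struct.pack('<HH', high, low)


def contains_surrogate_units(raw):
    """Return True if any 16-bit unit of the UTF-16LE data is a surrogate.

    Stand-in for a UCS-2 validity check; this is an assumption, not the
    actual BaseTools logic.
    """
    body = raw[2:] if raw.startswith(codecs.BOM_UTF16_LE) else raw
    count = len(body) // 2
    units = struct.unpack('<%dH' % count, body[:count * 2])
    return any(0xD800 <= unit <= 0xDFFF for unit in units)


raw = utf16le_surrogate_bytes(0x10300)
print(repr(raw))
print(contains_surrogate_units(raw))
```

For 0x00010300 this produces the same six bytes as the codecs.encode()
output quoted above (BOM, then 0xd800 0xdf00 in little-endian order). A
binary test case could write bytes like these straight to the .uni file
and expect the check to fail.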

Thanks
Laszlo

------------------------------------------------------------------------------
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/edk2-devel
