spark git commit: [SPARK-23649][SQL] Skipping chars disallowed in UTF-8

2018-03-20 Thread wenchen
Repository: spark
Updated Branches: refs/heads/branch-2.2 175b221bc -> 367a16118

[SPARK-23649][SQL] Skipping chars disallowed in UTF-8

The mapping of UTF-8 char's first byte to char's size doesn't cover the whole range 0-255. It is defined only for 0-253:

spark git commit: [SPARK-23649][SQL] Skipping chars disallowed in UTF-8

2018-03-20 Thread wenchen
Repository: spark
Updated Branches: refs/heads/branch-2.3 c854b6ca7 -> 0b880db65

[SPARK-23649][SQL] Skipping chars disallowed in UTF-8

## What changes were proposed in this pull request?

The mapping of UTF-8 char's first byte to char's size doesn't cover the whole range 0-255. It is defined

spark git commit: [SPARK-23649][SQL] Skipping chars disallowed in UTF-8

2018-03-20 Thread wenchen
Repository: spark
Updated Branches: refs/heads/master 566321852 -> 5e7bc2ace

[SPARK-23649][SQL] Skipping chars disallowed in UTF-8

## What changes were proposed in this pull request?

The mapping of UTF-8 char's first byte to char's size doesn't cover the whole range 0-255. It is defined only
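
The problem these commit messages describe, a first-byte-to-size mapping that leaves some byte values undefined, can be illustrated with a minimal Java sketch. This is not the code from the Spark patch: the class `Utf8FirstByte` and both method names are illustrative, and the byte ranges follow the general UTF-8 definition (RFC 3629) rather than the table the commit changes. The assumption made here is that a byte which cannot start a UTF-8 sequence is treated as a one-byte character so that a scan can skip it and continue.

```java
// Illustrative sketch only; names and ranges are assumptions, not Spark's code.
final class Utf8FirstByte {

    /**
     * Returns the sequence length implied by a leading byte.
     * Every value 0..255 gets an answer: bytes that cannot start a
     * UTF-8 sequence return 1 so the caller skips exactly one byte.
     */
    static int numBytesForFirstByte(byte b) {
        int v = b & 0xFF;                      // Java bytes are signed; widen to 0..255
        if (v <= 0x7F) return 1;               // ASCII
        if (v >= 0xC2 && v <= 0xDF) return 2;  // two-byte lead
        if (v >= 0xE0 && v <= 0xEF) return 3;  // three-byte lead
        if (v >= 0xF0 && v <= 0xF4) return 4;  // four-byte lead
        // 0x80..0xC1 and 0xF5..0xFF are not valid lead bytes: skip one byte.
        return 1;
    }

    /** Counts characters by hopping from lead byte to lead byte. */
    static int countChars(byte[] bytes) {
        int count = 0;
        int i = 0;
        while (i < bytes.length) {
            i += numBytesForFirstByte(bytes[i]);
            count++;
        }
        return count;
    }
}
```

Because the function is total over 0-255, a stray byte such as 0xFE or 0xFF advances the scan by one position instead of indexing past the end of a lookup table.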