Re: Let Flink SQL PlannerExpressionParserImpl#FieldRefrence use Unicode as its default charset

Thu, 16 Jan 2020 00:15:02 -0800

Hi,

What I am talking about is the `PlannerExpressionParserImpl`, which is written
with the Scala parser combinator library. Every time we call
StreamTableEnvironment#FromDataStream, the field String (or a scala.Symbol in
the Scala API) is parsed by `PlannerExpressionParserImpl` into an
`Expression`.
As the parser grammar in `PlannerExpressionParserImpl` shows, the
`fieldRefrence` is defined as `*` or `ident`. The `ident` in
`PlannerExpressionParserImpl` is just the one from
[[scala.util.parsing.combinator.JavaTokenParsers]], which is a Java
identifier.



After discussing this with Jark, I also discovered that
`PlannerExpressionParserImpl` currently does not even support quoting ('`'). I
did not know what you just told me about Calcite, but that does not matter.
Maybe we can let PlannerExpressionParserImpl#FieldRefrence use Unicode as its
default charset and support '`' as a first step, and then make the whole
project support the Unicode charset once the Calcite-related part is available.
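
A minimal sketch of what that first step could look like, written in plain Java rather than with the Scala combinators. The class `FieldRefSketch` and the method `parseFieldReference` are hypothetical names for illustration and are not part of Flink. It assumes a field reference is `*`, a Java identifier, or any name wrapped in backticks.

```java
final class FieldRefSketch {
    // True when every character satisfies Java's identifier character rules
    // (reserved words are not checked in this sketch).
    static boolean isJavaIdentifier(String name) {
        if (name.isEmpty() || !Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical helper: returns the referenced field name, or throws if
    // the input is not "*", a Java identifier, or a `quoted` name.
    static String parseFieldReference(String input) {
        String s = input.trim();
        if (s.equals("*")) {
            return s;
        }
        if (s.length() >= 3 && s.startsWith("`") && s.endsWith("`")) {
            String inner = s.substring(1, s.length() - 1);
            if (inner.indexOf('`') >= 0) {
                throw new IllegalArgumentException("Unexpected '`' inside quoted name: " + input);
            }
            return inner; // quoted names may contain any other character
        }
        if (isJavaIdentifier(s)) {
            return s;
        }
        throw new IllegalArgumentException("Not a field reference: " + input);
    }
}
```

With this shape, existing unquoted names keep working, and a name such as `@timestamp` is accepted only when it is quoted.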




Btw, I attended your talk on Calcite at FFA Asia, which really inspired me
a lot~





Best Regards
Shoi Liu








------------------ Original Message ------------------
From: "Danny Chan"<yuzhao....@gmail.com>;
Sent: Thursday, January 16, 2020, 12:45 PM
To: <john_aka_n...@qq.com>;

Subject: Re: Let Flink SQL PlannerExpressionParserImpl#FieldRefrence use
Unicode as its default charset



 User-defined charsets for DB/session/table/column are not supported in Flink
yet. Specifically, Flink uses Calcite as the planner engine, which also does
not support configurable charsets well; there is a design doc [1], but it has
never been implemented. Apache Calcite's default system charset is "ISO-8859-1".

 Actually, I'm a little confused by your description: do you mean the
charset of SqlIdentifier or of string literals? They are different topics.
 

 
[1] https://docs.google.com/document/d/1wo5byn_6K_YOKiPdXNav1zgzt9IBC3SbPvpPnIShtXk/edit#heading=h.g4bnumde4dl5
 
 
 
 
 Best, Danny Chan
 
 
 On Jan 15, 2020, 11:08 PM +0800, <john_aka_n...@qq.com> wrote:
  Hi all,
  The related issue: https://issues.apache.org/jira/browse/FLINK-15573
 

  As the title says, what I want to do is let the `FieldRefrence` use
Unicode as its default charset (or perhaps as an optional charset that can
be configured).
  According to `PlannerExpressionParserImpl`, Flink currently uses
JavaIdentifier as `FieldRefrence`'s default charset. From my perspective,
that is not enough. Consider users who use Elasticsearch as a sink: ES has a
field called `@timestamp`, which JavaIdentifier cannot match.
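
To make the `@timestamp` case concrete, here is a small sketch that checks names against Java's identifier character rules using the standard `Character.isJavaIdentifierStart` and `Character.isJavaIdentifierPart` methods. The class name `FieldNameCheck` is made up for illustration, and the sketch assumes the parser's `ident` follows these same rules.

```java
final class FieldNameCheck {
    // True when the whole name satisfies Java's identifier character rules.
    // Reserved words are not checked in this sketch.
    static boolean isJavaIdentifier(String name) {
        if (name.isEmpty() || !Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // '@' is not a valid identifier start, so "@timestamp" is rejected.
        for (String name : new String[] {"timestamp", "@timestamp", "$price"}) {
            System.out.println(name + " -> " + isJavaIdentifier(name));
        }
    }
}
```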
 

  So in my team, we simply let `PlannerExpressionParserImpl#FieldRefrence`
use Unicode as its default charset, which solves this kind of problem. (Please
refer to the issue I mentioned above.)
 

 In my opinion, the change would serve a general purpose:
  First, MySQL supports Unicode as the default field charset (see the field
named `@@`), so shouldn't we support Unicode as well?
 <db4b1...@cdf2b370.c12a1f5e.jpg>
 

  What's more, my team really gets a lot of benefit from this change. I also
believe that it can benefit other users without doing any harm!
  Fortunately, the change is fully backward compatible, because Unicode is a
superset of JavaIdentifier. Only a few code changes are needed to achieve
this goal.
  Looking forward to any opinions.

  Btw, thanks to tison~
 

  
  

 Best Regards
 Shoi Liu
 

 
 
