+CC: Ciara Power, Telemetry library maintainer > From: fengchengwen [mailto:fengcheng...@huawei.com] > Sent: Saturday, 18 June 2022 05.52 > > On 2022/6/18 1:05, Stephen Hemminger wrote: > > On Fri, 17 Jun 2022 12:25:04 +0100 > > Bruce Richardson <bruce.richard...@intel.com> wrote: > > > >> On Fri, Jun 17, 2022 at 01:16:08PM +0200, Morten Brørup wrote: > >>>> From: Chengwen Feng [mailto:fengcheng...@huawei.com] > >>>> Sent: Friday, 17 June 2022 11.46 > >>>> > >>>> This patch supports escape special characters (including: > \",\\,/,\b, > >>>> /f,/n,/r,/t) when telemetry string. > >>>> This patch is used to support telemetry xxx-dump commands which > the > >>>> string may include special characters. > >>>> > >>>> Signed-off-by: Chengwen Feng <fengcheng...@huawei.com> > >>>> --- > >>>> lib/telemetry/telemetry.c | 96 > +++++++++++++++++++++++++++++++++++++-- > >>>> 1 file changed, 93 insertions(+), 3 deletions(-) > >>>> > >>>> diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c > >>>> index c6fd03a5ab..0f762f633e 100644 > >>>> --- a/lib/telemetry/telemetry.c > >>>> +++ b/lib/telemetry/telemetry.c > >>>> @@ -215,6 +215,94 @@ container_to_json(const struct rte_tel_data > *d, > >>>> char *out_buf, size_t buf_len) > >>>> return used; > >>>> } > >>>> > >>>> +static bool > >>>> +json_is_special_char(char ch) > >>>> +{ > >>>> + static unsigned char is_spec[256] = { 0 }; > >>>> + static bool init_once; > >>>> + > >>>> + if (!init_once) { > >>>> + is_spec['\"'] = 1; > >>>> + is_spec['\\'] = 1; > >>>> + is_spec['/'] = 1; > >>>> + is_spec['\b'] = 1; > >>>> + is_spec['\f'] = 1; > >>>> + is_spec['\n'] = 1; > >>>> + is_spec['\r'] = 1; > >>>> + is_spec['\t'] = 1; > >>>> + init_once = true; > >>>> + } > >>>> + > >>>> + return (bool)is_spec[(unsigned char)ch]; > >>>> +} > >> > >> According to the json spec at [1], the characters that need to be > escaped > >> are: > >> a) any characters <0x20 > >> b) inverted commas/quote character \" > >> c) the "reverse solidus character", better known to you and I as the > >> back-slash. > >> > >> Therefore, I think this table generation could be simplified, but > also > >> expanded using this. For completeness we should also see about > handling all > >> control characters if they are encountered. > >> > >> [1] https://www.rfc-editor.org/rfc/rfc8259.txt > >> > >> /Bruce > > > > Since it is trivial could be initializer? > > > > static const uint8_t is_spec[256] = { > > [0 ... 0x20] = 1, > > ['\"' ] = 1, > > ['\\' ] = 1, > > ['/'] = 1, > > > > etc > > > > Or we could change the telemetry API to disallow control characters? > > I was thinking about converting 0~0x20, but I don't think there's a > scenario. > > I prefer change the telemetry API to disallow control characters, and > this may not > be a violation of the ABI, if yes, the dpdk-telemetry.py will returns > an error.
I agree with Chengwen Feng. The telemetry data type is STRING, not BLOB. So we need to define exactly what the STRING type contains. I hope we can all agree that control characters should be disallowed. The more complicated question is: Do we want to use the ASCII character set only, or do we want to use UTF-8 encoded Unicode? Personally, think UTF-8 encoded Unicode is more future proof, and would vote for that. But I wouldn't reject limiting it to ASCII, and perhaps in the future introduce another data type for UTF-8 strings. UTF-8 is the modern choice, but it is incompatible with old stuff, e.g. many SNMP MIBs. > > So I think we could add declaring and checking functions to make sure > telemetry string > do not allow control characters. Input validation (when storing a string in the telemetry database) has a performance cost, so it could be a compile time debug option, like the memory cookies and mbuf integrity checks. Just a thought.