On Wed, Jun 22, 2022 at 10:19:48AM +0100, Bruce Richardson wrote: > On Wed, Jun 22, 2022 at 08:57:43AM +0100, Power, Ciara wrote: > > Hi folks, > > > > > -----Original Message----- > > > From: Morten Brørup <m...@smartsharesystems.com> > > > Sent: Saturday 18 June 2022 10:59 > > > To: fengchengwen <fengcheng...@huawei.com>; Stephen Hemminger > > > <step...@networkplumber.org>; Richardson, Bruce > > > <bruce.richard...@intel.com> > > > Cc: tho...@monjalon.net; ferruh.yi...@xilinx.com; Laatz, Kevin > > > <kevin.la...@intel.com>; andrew.rybche...@oktetlabs.ru; > > > jer...@marvell.com; sachin.sax...@oss.nxp.com; > > > hemant.agra...@nxp.com; dev@dpdk.org; Power, Ciara > > > <ciara.po...@intel.com> > > > Subject: RE: [PATCH v2 1/5] telemetry: escape special char when tel string > > > > > > +CC: Ciara Power, Telemetry library maintainer > > > > > > > From: fengchengwen [mailto:fengcheng...@huawei.com] > > > > Sent: Saturday, 18 June 2022 05.52 > > > > > > > > On 2022/6/18 1:05, Stephen Hemminger wrote: > > > > > On Fri, 17 Jun 2022 12:25:04 +0100 > > > > > Bruce Richardson <bruce.richard...@intel.com> wrote: > > > > > > > > > >> On Fri, Jun 17, 2022 at 01:16:08PM +0200, Morten Brørup wrote: > > > > >>>> From: Chengwen Feng [mailto:fengcheng...@huawei.com] > > > > >>>> Sent: Friday, 17 June 2022 11.46 > > > > >>>> > > > > >>>> This patch supports escape special characters (including: > > > > \",\\,/,\b, > > > > >>>> /f,/n,/r,/t) when telemetry string. > > > > >>>> This patch is used to support telemetry xxx-dump commands which > > > > the > > > > >>>> string may include special characters. > > > > >>>> > > > > >>>> Signed-off-by: Chengwen Feng <fengcheng...@huawei.com> > > > > >>>> --- > > > > >>>> lib/telemetry/telemetry.c | 96 > > > > +++++++++++++++++++++++++++++++++++++-- > > > > >>>> 1 file changed, 93 insertions(+), 3 deletions(-) > > > > >>>> > > > > >>>> diff --git a/lib/telemetry/telemetry.c > > > > >>>> b/lib/telemetry/telemetry.c index c6fd03a5ab..0f762f633e 100644 > > > > >>>> --- a/lib/telemetry/telemetry.c > > > > >>>> +++ b/lib/telemetry/telemetry.c > > > > >>>> @@ -215,6 +215,94 @@ container_to_json(const struct rte_tel_data > > > > *d, > > > > >>>> char *out_buf, size_t buf_len) > > > > >>>> return used; > > > > >>>> } > > > > >>>> > > > > >>>> +static bool > > > > >>>> +json_is_special_char(char ch) > > > > >>>> +{ > > > > >>>> + static unsigned char is_spec[256] = { 0 }; > > > > >>>> + static bool init_once; > > > > >>>> + > > > > >>>> + if (!init_once) { > > > > >>>> + is_spec['\"'] = 1; > > > > >>>> + is_spec['\\'] = 1; > > > > >>>> + is_spec['/'] = 1; > > > > >>>> + is_spec['\b'] = 1; > > > > >>>> + is_spec['\f'] = 1; > > > > >>>> + is_spec['\n'] = 1; > > > > >>>> + is_spec['\r'] = 1; > > > > >>>> + is_spec['\t'] = 1; > > > > >>>> + init_once = true; > > > > >>>> + } > > > > >>>> + > > > > >>>> + return (bool)is_spec[(unsigned char)ch]; } > > > > >> > > > > >> According to the json spec at [1], the characters that need to be > > > > escaped > > > > >> are: > > > > >> a) any characters <0x20 > > > > >> b) inverted commas/quote character \" > > > > >> c) the "reverse solidus character", better known to you and I as > > > > >> the back-slash. > > > > >> > > > > >> Therefore, I think this table generation could be simplified, but > > > > also > > > > >> expanded using this. For completeness we should also see about > > > > handling all > > > > >> control characters if they are encountered. > > > > >> > > > > >> [1] https://www.rfc-editor.org/rfc/rfc8259.txt > > > > >> > > > > >> /Bruce > > > > > > > > > > Since it is trivial could be initializer? > > > > > > > > > > static const uint8_t is_spec[256] = { > > > > > [0 ... 0x20] = 1, > > > > > ['\"' ] = 1, > > > > > ['\\' ] = 1, > > > > > ['/'] = 1, > > > > > > > > > > etc > > > > > > > > > > Or we could change the telemetry API to disallow control characters? > > > > > > > > I was thinking about converting 0~0x20, but I don't think there's a > > > > scenario. > > > > > > > > I prefer change the telemetry API to disallow control characters, and > > > > this may not be a violation of the ABI, if yes, the dpdk-telemetry.py > > > > will returns an error. > > > > > > I agree with Chengwen Feng. The telemetry data type is STRING, not BLOB. > > > > > > So we need to define exactly what the STRING type contains. > > > > > > I hope we can all agree that control characters should be disallowed. > > > > > > The more complicated question is: Do we want to use the ASCII character > > > set > > > only, or do we want to use UTF-8 encoded Unicode? > > > > > > Personally, think UTF-8 encoded Unicode is more future proof, and would > > > vote for that. > > > > > > But I wouldn't reject limiting it to ASCII, and perhaps in the future > > > introduce > > > another data type for UTF-8 strings. > > > > > > UTF-8 is the modern choice, but it is incompatible with old stuff, e.g. > > > many > > > SNMP MIBs. > > > > > [CP] > > > > Just from looking at the spec [1] , I would say UTF-8, as it seems to > > suggest its use for JSON (section 8.1). > > > > [1] https://www.rfc-editor.org/rfc/rfc8259.txt > > > > > > > > > > So I think we could add declaring and checking functions to make sure > > > > telemetry string do not allow control characters. > > [CP] > > > > I am not sure why we don't want these at all - I thought we do want some of > > them, like tab (\u0009) for example. > > > > <snip> > > > > In general, I think Bruce's suggestion of using a customised printf > > function instead of snprintf would be a good way forward, to scan the chars > > as they are being copied in. > > > > I'm hoping to have some time to try and prototype this myself soon, and > send out a draft patch to this mailing list for consideration. > Here is an RFC of my current prototype of this:
http://patches.dpdk.org/project/dpdk/list/?series=23739 Feedback welcome. Regards, /Bruce