Hi Jan,
If I understood the taocpp PEGTL correctly, it might be something like this:
struct extended_offset : sor< Z, seq< sign, zhh, opt< opt<colon>, zmm > > > {};
struct basic_offset : sor< Z, seq< sign, zhh, opt< opt<colon>, zmm > > > {};
With some additional tests to include something like this:
"2020-03-24T04:59:41+0000"
"2020-03-24T04:59:41+00:00"
"2020-03-24T04:59:41+00"
"2020-03-24T04:59:41"
"2020-03-24T04:59:41Z+0000"
"2020-03-24T04:59:41Z+00:00"
"2020-03-24T04:59:41Z+00"
"2020-03-24T04:59:41Z"
"2020-03-24T04:59:41-0000"
"2020-03-24T04:59:41-00:00"
"2020-03-24T04:59:41-00"
"2020-03-24T04:59:41"
"2020-03-24T04:59:41Z-0000"
"2020-03-24T04:59:41Z-00:00"
"2020-03-24T04:59:41Z-00"
"2020-03-24T04:59:41Z"
Sorry, this is my first time looking at taocpp PEGTL, but if it helps, I could
express it pretty concisely in a regular expression:
https://rubular.com/r/15kzhvvls0lD3E
(only for the offset portion, and it may still be incomplete)
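To make the offset-only idea concrete, here is a minimal C++ sketch using
std::regex. It is not the pattern from the Rubular link and not ArangoDB, PEGTL
or ICU code. The function names are hypothetical, and I am assuming hours run
00-23 and minutes 00-59. The lenient version mirrors the rule as written above
(colon optional); the other two show what a strict basic/extended split might
look like.

```cpp
#include <regex>
#include <string>

// Hypothetical helpers for illustration only.
// Assumption: hh is 00-23 and mm is 00-59.

// Lenient: "Z", or sign + hh, optionally followed by mm with or without ':'.
// Roughly mirrors sor< Z, seq< sign, zhh, opt< opt<colon>, zmm > > >.
bool is_lenient_offset(const std::string& s) {
    static const std::regex re(R"(Z|[+-]([01][0-9]|2[0-3])(:?[0-5][0-9])?)");
    return std::regex_match(s, re);  // must match the whole string
}

// Basic format: minutes, if present, follow the hours with no colon.
bool is_basic_offset(const std::string& s) {
    static const std::regex re(R"(Z|[+-]([01][0-9]|2[0-3])([0-5][0-9])?)");
    return std::regex_match(s, re);
}

// Extended format: minutes, if present, must be preceded by a colon.
bool is_extended_offset(const std::string& s) {
    static const std::regex re(R"(Z|[+-]([01][0-9]|2[0-3])(:[0-5][0-9])?)");
    return std::regex_match(s, re);
}
```

With these, "+0000" passes the lenient and basic checks but not the extended
one, "+00:00" passes the lenient and extended checks but not the basic one, and
something like "Z+0000" is rejected by all three.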
I'd recommend looking at ICU:
https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/classicu_1_1DateFormat.html
https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/classicu_1_1DateFormat.html#details
https://github.com/unicode-org/icu/blob/d3315d98ef82b09aabef12e3f7fb46d171d8bc32/icu4c/source/i18n/datefmt.cpp
Theirs is widely treated as the reference implementation for everything
date/time and Unicode.
Their parser probably expects a short string containing just a date, so you
would either call their function once you can determine where a candidate date
string starts and ends in your own parsing, or study their logic and their
tests. Their tests should be informative.
Unfortunately, it's an absolutely massive code base that takes time to dig
through, but GitHub's search helps…
> On Mar 31, 2020, at 13:18, [email protected] wrote:
>
> Hi Jan
>
> Thanks for confirming and thanks for the code reference. Very helpful to
> know.
> It might be good to note in the docs how conformant it is or is not.
> If I have time I will peek at the code and see if I can contribute.
>
> I wonder if some cases might already be well covered by ICU4C?
>
>>
>> On Mar 31, 2020, at 13:07, Jan Uhde <[email protected]> wrote:
>>
>> While the ArangoDB code is not 100% ISO-conformant, it should be correct
>> there. I think the ISO standard states that you should either use the
>> extended format or not. Therefore it should not be possible to set the colon
>> at some points but not at others, as is done in your example.
>>
>> To get an idea please look at this unfinished code:
>>
>> https://github.com/ObiWahn/PEGTL/commit/95a825326d734ff035c0979b2019695befda20b0#diff-88660bc76de9dabacba18430927dfcd1
>>
>> The relevant ArangoDB code can be found here:
>> https://github.com/arangodb/arangodb/blob/devel/lib/Basics/datetime.cpp
>>
>> Reading the code is your best shot at understanding what will work and what
>> will not, because it does not closely follow any standard.
>>
>> At some point I wanted to make it ISO-conformant, but it was decided that it
>> is not worth the trouble and that we do not want to change what is currently
>> supported.
>>
>> If you are interested, you can help with my free-time effort to finish the
>> PEG grammar in the above repository. It might be added to ArangoDB because we
>> now use PEGTL in other places. That was not the case when I last looked at
>> the problem, so an additional library would have had to be added, which met
>> with more resistance.
>>
>>
>> --
>> You received this message because you are subscribed to a topic in the
>> Google Groups "ArangoDB" group.
>> To unsubscribe from this topic, visit
>> https://groups.google.com/d/topic/arangodb/TWrmSbJq0vY/unsubscribe.
>> To unsubscribe from this group and all its topics, send an email to
>> [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/arangodb/8e857aa3-fd7a-4a30-a4ff-b1ef208015f0%40googlegroups.com.