Zibi,
I'm trying to parse and then serialize back the following entity with your
parser and serializer:
<nSpinnerSeconds[@cldr.plural(n)] {
zero: "zero seconds",
one: "one second",
two: "{{ n }} seconds",
few: "{{ n }} seconds",
many: "{{ n }} seconds",
other: "{{ n }} seconds"
}>
The parser gives me this:
{
'$v': {
'many': [{
't': 'id',
'v': 'n'
}, ' seconds'],
'two': [{
't': 'id',
'v': 'n'
}, ' seconds'],
'one': 'one second',
'few': [{
't': 'id',
'v': 'n'
}, ' seconds'],
'zero': 'zero seconds',
'other': [{
't': 'id',
'v': 'n'
}, ' seconds']
},
'$x': [{
'a': [{
't': 'id',
'v': 'n'
}],
't': 'call',
'v': {
't': 'glob',
'v': 'cldr.plural'
}
}],
'$i': 'nSpinnerSeconds'
}
Is this expected? Instead, I was hoping for something more l10n-tool
friendly:
{
"$v": {
"zero": "zero seconds",
"one": "one second",
"two": "{{ n }} seconds",
"few": "{{ n }} seconds",
"many": "{{ n }} seconds",
"other": "{{ n }} seconds"
},
"$i": "nSpinnerSeconds"
}
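
To make the request concrete, here is a rough Python sketch of the conversion I have in mind. The names (flatten_string, flatten_entity, PLURAL_ORDER) are made up for this mail and are not part of either parser. It only handles the node shapes shown above (plain strings and 'id' placeables), and it drops '$x' just like my hoped-for output does.

```python
# Hypothetical helpers, not part of python-l20n or l20n.js.
# Assumes the AST shape quoted in this mail: '$v' is a string, a list of
# parts, or a dict of such values; placeables look like {'t': 'id', 'v': 'n'}.
PLURAL_ORDER = ["zero", "one", "two", "few", "many", "other"]


def flatten_string(value):
    """Rebuild a source-like string from a parsed string value."""
    if isinstance(value, str):
        return value  # simple string, nothing to rebuild
    parts = []
    for part in value:
        if isinstance(part, str):
            parts.append(part)
        elif part.get("t") == "id":
            parts.append("{{ " + part["v"] + " }}")
        else:
            raise ValueError("unsupported node type: %r" % part.get("t"))
    return "".join(parts)


def _rank(key):
    # Known plural categories first, in CLDR order; anything else after them.
    if key in PLURAL_ORDER:
        return PLURAL_ORDER.index(key)
    return len(PLURAL_ORDER)


def flatten_entity(entity):
    """Return an l10n-tool-friendly copy of an entity (drops '$x')."""
    value = entity["$v"]
    if isinstance(value, dict):
        flat = {k: flatten_string(value[k]) for k in sorted(value, key=_rank)}
    else:
        flat = flatten_string(value)
    return {"$v": flat, "$i": entity["$i"]}
```

Whether something like this belongs in the parser, in the serializer, or in the tool itself is exactly what I am unsure about.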
-Matjaž
On Fri, May 1, 2015 at 2:35 AM, Zibi Braniecki <[email protected]
> wrote:
> Next update!
>
> I got both python [0] and js [1] serializers to work! I can't say they are
> complete, and I don't have tests yet, but from my hand testing they seem
> usable.
>
> I also added ./tools/serialize.js|py to both repositories.
>
> So now I have:
> - two parsers that produce the same JSON AST
> - serializers that can take that AST and reproduce L20n
>
> Which means that we should be able to freely interact between js and
> python and also read/write L20n for tools purposes.
> Axel, I also removed unescape dependency from JS Parser, so you should be
> able to use it in Aisle.
>
> Working on that brought three topics that I so far left unresolved:
>
> 1) Source notation. Currently neither parser stores any information about
> the position of syntax nodes in the source. I believe it would be worth
> figuring out how we want to handle that. The first idea that comes to mind
> is that we could just add a kvp on the node object like 'source': {'start':
> 49, 'end': 102, 'string': '...'} for an editor to use.
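
A minimal sketch of how I read this idea, in Python. The helper name with_source is hypothetical, and treating 'end' as an exclusive offset is my assumption; the proposal does not say either way.

```python
# Hypothetical helper; 'end' is assumed to be exclusive (like Python slices).
def with_source(node, text, start, end):
    """Return a copy of an AST node annotated with its source span."""
    annotated = dict(node)  # leave the original node untouched
    annotated["source"] = {
        "start": start,
        "end": end,
        "string": text[start:end],
    }
    return annotated
```

An editor could then map a node back to its exact slice of the file without re-parsing.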
>
> 2) String notations. When a string is used it may be surrounded by ", ' or
> (in the future) """ or '''. Once we parse it, we don't store this
> information, so on serialization we cannot reuse it.
>
> We could guess (for example: multiline uses triple-quotes, single line
> uses " unless it has " inside it and no ', in which case it uses '), but we
> could also somehow store it on the string.
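
The guessing variant could look roughly like this in Python. The function name choose_quote is hypothetical, and the triple-quote branch assumes the future """ syntax mentioned above.

```python
# Hypothetical sketch of the "guess the delimiter" heuristic.
def choose_quote(text):
    """Pick a string delimiter for serialization."""
    if "\n" in text:
        return '"""'  # multiline: triple quotes (future syntax, assumed)
    if '"' in text and "'" not in text:
        return "'"    # contains " but no ': single quotes avoid escaping
    return '"'        # default; any " inside still has to be escaped
```

With this rule a string that contains both " and ' falls back to ", so the serializer still needs an escaping step for that case.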
>
> 3) Unescaping.
>
> Right now we do something very naive: we unescape unicode and remove a
> backslash from in front of any other character, treating the following char
> as non-semantic.
>
> It works well enough, you can do: <foo "hey \" ho"> or <foo "hey \{{ var
> }} ho"> and it will all be stored as a simple string.
>
> But with serialization, problems arise.
>
> First, unicode \uXXXX will be turned into a unicode char by parser so the
> serializer will have no way to figure out what form of unicode has been
> used and will serialize it as a unicode char.
>
> Second, sometimes there is no way to know which escape form has been
> used. Like:
>
> <foo "hey \{{ var }}"> and <foo "hey {\{ var }}"> will produce the same
> AST. During serialization we can identify that since the AST node is a
> simple string "hey {{ var }}" and not a complex string, we should escape
> the {{ to remove the syntactic meaning, but we have no way to know which
> char should be escaped.
>
> Third, all other chars are just unescaped, so <foo "hey \n"> will be turned
> into <foo "hey n"> and <foo "hey \l"> will be turned into <foo "hey l">.
>
> That means that when serializing we will just write it back without a
> backslash.
>
> We can limit the backslash use, and raise errors in parser if \ precedes
> an unknown char, and then have rules in the serializer, to backslash a
> backslash, backslash {{ and backslash string closing mark, but for chars
> like "\n" we will hit the same problem as with unicode:
>
> <foo "hey
> ho"> and <foo "hey \n ho"> will produce the same AST. What should we
> serialize it into?
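
For the three serializer rules listed above (backslash a backslash, backslash {{ and backslash the closing mark), a Python sketch could be as follows. The name escape_string is hypothetical. It always escapes the first brace of a pair, which is an arbitrary canonical choice and not something the AST tells us, and it leaves the "\n" question open by writing newlines out literally.

```python
import re

# Hypothetical sketch of the serializer-side escaping rules.
def escape_string(text, quote='"'):
    """Serialize a simple (non-complex) string with canonical escapes."""
    # 1. Backslashes first, so escapes added below are not doubled.
    out = text.replace("\\", "\\\\")
    # 2. Put a backslash before every { that is followed by another {,
    #    so no {{ sequence survives to be read as a placeable.
    out = re.sub(r"\{(?=\{)", lambda m: "\\{", out)
    # 3. Escape the closing mark.
    out = out.replace(quote, "\\" + quote)
    return quote + out + quote
```

So "hey {{ var }}" always comes out as "hey \{{ var }}", no matter which of the two forms the source used.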
>
> Would love to get your feedback!
> zb.
>
> [0]
> https://github.com/l20n/python-l20n/blob/master/lib/l20n/format/serializer.py
> [1]
> https://github.com/zbraniecki/l20n.js/blob/v3-features/src/lib/format/l20n/serializer.js
> _______________________________________________
> tools-l10n mailing list
> [email protected]
> https://lists.mozilla.org/listinfo/tools-l10n
>