We have to distinguish between the JSON representation of a feature as defined in our "specification" and the additional support provided by our Maven plugin. A feature must have an ID, and we don't allow interpolation of placeholders within the feature. The JSON code in the feature-io module that reads a feature works on this basis, and we should not change that.

However, for Maven-based projects, the Maven plugin allows the id to be left out; it is then calculated from the project coordinates and the file name. The Maven plugin also allows placeholders in the feature file to be interpolated.

Now, in a Maven project, these two steps need to happen before validation; otherwise the validation might fail.
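
To illustrate why the ordering matters, here is a minimal Java sketch. It is not the plugin's code: the class and method names, the ${...} placeholder syntax, the property names and the id layout (groupId:artifactId:classifier:version, with the classifier taken from the file name) are all assumptions made for illustration, and a flat map stands in for the parsed feature.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: Maven-side preprocessing followed by strict validation.
final class MavenPreprocessSketch {

    // Assumed placeholder syntax: ${name}
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces known placeholders; unknown ones are left untouched.
    static String interpolate(String text, Map<String, String> props) {
        Matcher m = PLACEHOLDER.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.get(m.group(1));
            String replacement = value != null ? value : m.group(0);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Assumed id layout; the real calculation may differ.
    static String defaultId(Map<String, String> props, String fileName) {
        String classifier = fileName.endsWith(".json")
                ? fileName.substring(0, fileName.length() - ".json".length())
                : fileName;
        return props.get("project.groupId") + ":" + props.get("project.artifactId")
                + ":" + classifier + ":" + props.get("project.version");
    }

    // Maven-only steps: interpolate values, then add the id if it is missing.
    static Map<String, String> preprocess(Map<String, String> raw,
                                          Map<String, String> props,
                                          String fileName) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : raw.entrySet()) {
            result.put(e.getKey(), interpolate(e.getValue(), props));
        }
        if (!result.containsKey("id")) {
            result.put("id", defaultId(props, fileName));
        }
        return result;
    }

    // Stand-in for the strict rules: an id is required, placeholders are not allowed.
    static void validateStrict(Map<String, String> feature) {
        if (!feature.containsKey("id")) {
            throw new IllegalArgumentException("feature has no id");
        }
        for (String value : feature.values()) {
            if (PLACEHOLDER.matcher(value).find()) {
                throw new IllegalArgumentException("unresolved placeholder: " + value);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("project.groupId", "org.example");
        props.put("project.artifactId", "demo");
        props.put("project.version", "1.0.0");

        Map<String, String> raw = new LinkedHashMap<>();
        raw.put("title", "Demo ${project.version}");

        Map<String, String> feature = preprocess(raw, props, "base.json");
        validateStrict(feature); // passes only because preprocessing ran first
        System.out.println(feature);
    }
}
```

Calling validateStrict on the raw map instead would fail, which is the point: the strict rules stay untouched, and the Maven-specific steps simply run in front of them.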

If we can make these steps more efficient, great - but we must not break the separation of the functionality.

In addition, do we have any numbers that show how "slow" this currently is and how much we could improve it? Feature files are rather small and all processing happens in memory. We're also talking about a build-time tool here, so if it spends some extra milliseconds, this doesn't really matter.

Again, if we can improve this, let's do it - but I don't think it's our most urgent problem to fix wrt the feature model.

Regards
Carsten

On 13.11.2018 at 02:53, Justin Edelson wrote:
Hi Simo,
Take this with several grains of salt as I don't know the internals of the
feature processing, but just looking at your email from a generic "how do I
process a JSON file" perspective, it still seems inefficient.

Ideally, IMO, the substitution would be done as a filter applied to the
stream of parser events. That way the entire String is not held in memory
-- only the parsed DOM. I suspect it is also "safer" in the sense that you
can more tightly control the context in which interpolation occurs (for
example, interpolation should be allowed in string values, but not keys);
the flip side is that it also is more restrictive, i.e. supporting
interpolation of non-String values would be non-trivial (then again, doing
this would make the original document invalid JSON so I'm not sure this is
a real use case). I would suggest taking a look at Jackson's
JsonParserDelegate.
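
A minimal sketch of the idea, in plain Java without Jackson: the Event and Kind types below are hypothetical stand-ins for a parser's token stream, and the ${...} placeholder syntax is an assumption. With Jackson, the same filtering would presumably live in a JsonParserDelegate subclass; this only shows the shape of a filter that touches string values and leaves keys alone.

```java
import java.util.Iterator;
import java.util.Map;

// Hypothetical sketch: interpolation as a filter over parser events.
final class EventFilterSketch {

    // Simplified event kinds; a real parser has more.
    enum Kind { START_OBJECT, FIELD_NAME, VALUE_STRING, VALUE_NUMBER, END_OBJECT }

    static final class Event {
        final Kind kind;
        final String text;

        Event(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    // Replaces each ${name} for which a value is known (assumed syntax).
    static String interpolate(String text, Map<String, String> vars) {
        String result = text;
        for (Map.Entry<String, String> e : vars.entrySet()) {
            result = result.replace("${" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    // Wraps an event stream; only VALUE_STRING events are rewritten,
    // so keys and non-string values pass through unchanged.
    static Iterator<Event> interpolating(final Iterator<Event> source,
                                         final Map<String, String> vars) {
        return new Iterator<Event>() {
            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public Event next() {
                Event e = source.next();
                if (e.kind == Kind.VALUE_STRING) {
                    return new Event(e.kind, interpolate(e.text, vars));
                }
                return e;
            }
        };
    }
}
```

Because the filter works one event at a time, nothing beyond the current token has to be held as a String, and the decision about where interpolation is allowed sits in a single place.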

Regards,
Justin

On Mon, Nov 12, 2018 at 2:46 PM Simone Tripodi <simonetrip...@apache.org>
wrote:

Hi all mates,

during the last couple of months the work we've been doing on Feature
file processing has been HUGE, and the iterations to refine the pipeline
introduced some "overhead" operations we can improve. What we
currently do is:

  * the preprocessor starts by reading the whole file into memory,
storing it in a String reference;
  * parse the JSON file to create the javax.json DOM and check whether
the `id` property is missing, adding it if necessary and then
serializing it to a string again;
  * JSON Schema validation takes the string as input, creates the
Jackson DOM to validate it against the defined schema;
  * if schema validation is OK, the Substitution takes the JSON string
as input to interpolate variables, which creates a new JSON string
representation;
  * the JS Min takes the JSON string representation and converts it to
a new JSON string representation from which unneeded content is removed;
  * at that point, the JSON Feature reader takes the final string and
creates a javax.json DOM once again to map it to a Feature instance.

My proposal is to improve our pipeline a little in order to speed up
the JSON processing, as follows:

  * the JS Min starts by reading the whole file into memory, storing it
in a String reference;
  * the Substitution takes the JSON string as input to interpolate
variables, which creates a new JSON string representation;
  * a Jackson DOM will be created in order to check whether the `id`
property is missing, adding it if necessary;
  * the Jackson DOM will be validated against the defined schema;
  * the Jackson DOM will be mapped to a Feature instance.

WDYT?

Many thanks in advance!
~Simo

http://people.apache.org/~simonetripodi/
http://twitter.com/simonetripodi



--
Carsten Ziegeler
Adobe Research Switzerland
cziege...@apache.org
