Re: "Flatten" JSON

Mark Payne Fri, 15 Sep 2017 06:02:19 -0700

Nick,

I believe there is a way to do what you're asking with Jolt without knowing
anything about the schema. That said, Jolt can get complex pretty quickly and
I don't know it well :)  I have no problem with adding a FlattenRecord
processor. I guess the question, though, is whether you are using
Record-oriented processors or JSON-specific processors.


Personally, I'd prefer a FlattenRecord processor over FlattenJSON, because the
transformation would then apply to Avro as well (and, as soon as we get an XML
reader built, to XML also). However, the Record-oriented processors would
expect a schema to be given (though it could also be inferred using another
existing processor).
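
As a rough illustration of the transformation being asked for, here is a
minimal Python sketch of a schema-free flatten. It is not NiFi code and not an
existing processor; the function name `flatten` and its `sep` parameter are
made up for this example, and it assumes arrays should be left untouched rather
than expanded.

```python
import json


def flatten(value, prefix="", sep="."):
    """Collapse nested dicts into one level, joining key names with `sep`.

    Hypothetical helper for illustration only. Lists and empty dicts are
    copied through unchanged (an assumption; the thread does not say how
    they should be handled).
    """
    flat = {}
    for key, child in value.items():
        path = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(child, dict) and child:
            # Descend into nested objects, carrying the dotted path along.
            flat.update(flatten(child, path, sep))
        else:
            flat[path] = child
    return flat


# The sample document from the quoted message below.
nested = json.loads("""
{
  "query": {
    "count": 1,
    "created": "2017-09-15T11:20:26Z",
    "lang": "en-US",
    "results": {
      "channel": {
        "item": {
          "condition": {
            "code": "33",
            "date": "Fri, 15 Sep 2017 06:00 AM EDT",
            "temp": "63",
            "text": "Mostly Clear"
          }
        }
      }
    }
  }
}
""")

print(json.dumps(flatten(nested), indent=1))
```

No schema is needed because the function only looks at the shape of the value
it is handed. A Record-based version would walk the record's fields in the
same way, but would also have to produce a flattened schema alongside the
data.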

-Mark



> On Sep 15, 2017, at 7:43 AM, Nicholas Hughes <nicholasmhughes.n...@gmail.com> 
> wrote:
> 
> Is there an easy way to "flatten" arbitrary JSON within NiFi?
> 
> For input data like that shown below from Yahoo [1]
> 
> {
>  "query": {
>    "count": 1,
>    "created": "2017-09-15T11:20:26Z",
>    "lang": "en-US",
>    "results": {
>      "channel": {
>        "item": {
>          "condition": {
>            "code": "33",
>            "date": "Fri, 15 Sep 2017 06:00 AM EDT",
>            "temp": "63",
>            "text": "Mostly Clear"
>          }
>        }
>      }
>    }
>  }
> }
> 
> 
> ...I'd like to end up with output something like this:
> 
> {
>  "query.count": 1,
>  "query.created": "2017-09-15T11:20:26Z",
>  "query.lang": "en-US",
>  "query.results.channel.item.condition.code": "33",
>  "query.results.channel.item.condition.date": "Fri, 15 Sep 2017 06:00 AM EDT",
>  "query.results.channel.item.condition.temp": "63",
>  "query.results.channel.item.condition.text": "Mostly Clear"
> }
> 
> 
> I checked out the JoltTransformJSON processor and some examples, such as
> the nested data to "prefix soup" demo [2], but it seems as though I need to
> enter information about the schema for the incoming data in order to
> transform it. Ideally, I'd like to have a processor "just figure it out"
> without explicit entry of a schema.
> 
> Is there any way to accomplish this in a generic way with JoltTransformJSON
> (or another native processor)?
> 
> If not, would a ticket requesting a "Field Flattener" processor much like
> the one included in StreamSets Data Collector [3] be worthwhile?
> 
> Thanks in advance!
> 
> -Nick
> 
> 
> [1]
> https://query.yahooapis.com/v1/public/yql?q=select%20item.condition%20from%20weather.forecast%20where%20woeid%20%3D%202383558&format=json&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys
> 
> [2] http://jolt-demo.appspot.com/#bucketToPrefixSoup
> 
> [3]
> https://github.com/streamsets/datacollector/tree/master/basic-lib/src/main/java/com/streamsets/pipeline/stage/processor/fieldflattener
