Re: "Flatten" JSON

Kevin Doran Fri, 15 Sep 2017 06:34:13 -0700

+1 for adding a FlattenRecord processor. I can think of a few scenarios in 
which it would be quite useful, and it would be convenient if it could be 
accomplished without JOLT.


Thanks,
Kevin

On 9/15/17, 09:16, "Nicholas Hughes" <nicholasmhug...@gmail.com on behalf of 
nicholasmhughes.n...@gmail.com> wrote:

    Mark,
    
    I'm definitely for making the processor as generic as possible. I don't
    mind chaining together a few simple processors to get a job done (such as
    convert JSON to Avro > infer schema > flatten records)... I just don't want
    the steps to get super complex... and the Jolt Transform processor does seem very
    powerful and very complex.
    
    If there's some support for a "FlattenRecord" processor, I can submit the
    Jira containing the meat of this thread.
    
    -Nick
    
    
    On Fri, Sep 15, 2017 at 9:01 AM, Mark Payne <marka...@hotmail.com> wrote:
    
    > Nick,
    >
    > I do believe that there's a way to do what you're asking with Jolt,
    > without knowing any kind of schema.
    > That said, Jolt can get complex pretty quickly and I don't know it well
    > :)  Personally, I have no problem with having a
    > FlattenRecord processor. I guess the question here, though, is are you
    > using Record-oriented processors,
    > or are you using JSON-specific processors?
    >
    > Personally, I'd like to see a FlattenRecord processor, rather than
    > FlattenJSON, because that would allow
    > the transformation to apply to Avro as well (and as soon as we get an XML
    > reader built, XML also). However,
    > the Record-oriented processors would expect that a schema be given (though
    > it could also be inferred using
    > another existing processor).
    >
    > -Mark
    >
    >
    >
    > > On Sep 15, 2017, at 7:43 AM, Nicholas Hughes <nicholasmhughes.n...@gmail.com> wrote:
    > >
    > > Is there an easy way to "flatten" arbitrary JSON within NiFi?
    > >
    > > For input data like that shown below from Yahoo [1]
    > >
    > > {
    > >  "query": {
    > >    "count": 1,
    > >    "created": "2017-09-15T11:20:26Z",
    > >    "lang": "en-US",
    > >    "results": {
    > >      "channel": {
    > >        "item": {
    > >          "condition": {
    > >            "code": "33",
    > >            "date": "Fri, 15 Sep 2017 06:00 AM EDT",
    > >            "temp": "63",
    > >            "text": "Mostly Clear"
    > >          }
    > >        }
    > >      }
    > >    }
    > >  }
    > > }
    > >
    > >
    > > ...I'd like to end up with output something like this:
    > >
    > > {
    > >  "query.count": 1,
    > >  "query.created": "2017-09-15T11:20:26Z",
    > >  "query.lang": "en-US",
    > >  "query.results.channel.item.condition.code": "33",
    > >  "query.results.channel.item.condition.date": "Fri, 15 Sep 2017 06:00 AM EDT",
    > >  "query.results.channel.item.condition.temp": "63",
    > >  "query.results.channel.item.condition.text": "Mostly Clear"
    > > }
    > >
    > >
    > > I checked out the JoltTransformJSON processor and some examples, such as
    > > the nested data to "prefix soup" demo [2], but it seems as though I need to
    > > enter information about the schema for the incoming data in order to
    > > transform it. Ideally, I'd like to have a processor "just figure it out"
    > > without explicit entry of a schema.
    > >
    > > Is there any way to accomplish this in a generic way with JoltTransformJSON
    > > (or another native processor)?
    > >
    > > If not, would a ticket requesting a "Field Flattener" processor much like
    > > the one included in StreamSets Data Collector [3] be worthwhile?
    > >
    > > Thanks in advance!
    > >
    > > -Nick
    > >
    > >
    > > [1]
    > > https://query.yahooapis.com/v1/public/yql?q=select%20item.condition%20from%20weather.forecast%20where%20woeid%20%3D%202383558&format=json&env=store%3A%2F%2Fdatatables.org%2Falltableswithkeys
    > >
    > > [2] http://jolt-demo.appspot.com/#bucketToPrefixSoup
    > >
    > > [3]
    > > https://github.com/streamsets/datacollector/tree/master/basic-lib/src/main/java/com/streamsets/pipeline/stage/processor/fieldflattener
    >
    >
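
For illustration, the schema-free flattening requested at the start of this
thread can be sketched in a few lines of Python. This is not NiFi code and not
necessarily how a FlattenRecord processor would be implemented; the function
name `flatten` and its `prefix` and `sep` parameters are assumptions made for
this sketch. It handles nested objects only: arrays are kept as leaf values,
and empty objects produce no output key.

```python
import json


def flatten(value, prefix="", sep="."):
    """Collapse nested dicts into one level, joining key paths with `sep`.

    Hypothetical helper for illustration only. Lists and scalars are treated
    as leaves; an empty nested dict contributes no keys.
    """
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            # Build the dotted path; the top level has no prefix yet.
            path = f"{prefix}{sep}{key}" if prefix else key
            flat.update(flatten(child, path, sep))
    else:
        flat[prefix] = value
    return flat


# Sample input copied from the original question in this thread.
sample = json.loads("""
{
  "query": {
    "count": 1,
    "created": "2017-09-15T11:20:26Z",
    "lang": "en-US",
    "results": {
      "channel": {
        "item": {
          "condition": {
            "code": "33",
            "date": "Fri, 15 Sep 2017 06:00 AM EDT",
            "temp": "63",
            "text": "Mostly Clear"
          }
        }
      }
    }
  }
}
""")

print(json.dumps(flatten(sample), indent=1))
```

Run against the sample, this prints the seven dotted keys listed in the
desired output of the original question. No schema is consulted; the key paths
come entirely from walking the parsed document.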
