Bob,

InferAvroSchema can infer types such as boolean, integer, long, float, and
double, and I believe that for JSON it can correctly descend into arrays and
nested maps/structs/objects. Here is an example record from NiFi provenance
data that covers most of those (except boolean and float/double, but you can
add those):

{
  "eventId" : "7422645d-056e-423b-b280-6305f9daccaa",
  "eventOrdinal" : 0,
  "eventType" : "CREATE",
  "timestampMillis" : 1496934288944,
  "timestamp" : "2017-06-08T15:04:48.944Z",
  "durationMillis" : -1,
  "lineageStart" : 1496934288930,
  "componentId" : "8821e5d8-015c-1000-30b0-f7211bbf43e5",
  "componentType" : "GenerateFlowFile",
  "componentName" : "_GenerateFlowFile",
  "entityId" : "b99a56c6-e032-4396-915e-24186974b84a",
  "entityType" : "org.apache.nifi.flowfile.FlowFile",
  "entitySize" : 52,
  "updatedAttributes" : {
    "path" : "./",
    "uuid" : "b99a56c6-e032-4396-915e-24186974b84a",
    "filename" : "924304881186293"
  },
  "previousAttributes" : { },
  "actorHostname" : "localhost",
  "contentURI" : "http://localhost:8989/nifi-api/provenance-events/0/content/output",
  "previousContentURI" : "http://localhost:8989/nifi-api/provenance-events/0/content/input",
  "parentIds" : [ ],
  "childIds" : [ ],
  "platform" : "nifi",
  "application" : "NiFi Flow"
}
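
To make the idea of type inference concrete, here is a rough Python sketch of how JSON values could be mapped to Avro-style types. It is only an illustration, not the code InferAvroSchema actually runs: the function name infer_type, the int/long cutoff at 32 bits, and the "null" fallback for empty arrays are all assumptions made for this example.

```python
import json

# Assumed cutoff: values that fit in a signed 32-bit integer are called "int".
INT_MIN, INT_MAX = -2**31, 2**31 - 1

def infer_type(value, name="record"):
    """Map a parsed JSON value to an Avro-style type (illustrative only)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "int" if INT_MIN <= value <= INT_MAX else "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        # An empty array carries no element type; "null" is a placeholder here.
        items = infer_type(value[0], name) if value else "null"
        return {"type": "array", "items": items}
    if isinstance(value, dict):
        fields = [{"name": k, "type": infer_type(v, k)} for k, v in value.items()]
        return {"type": "record", "name": name, "fields": fields}
    raise TypeError(f"unsupported JSON value: {value!r}")

# A subset of the provenance record above.
sample = json.loads("""
{
  "eventOrdinal" : 0,
  "eventType" : "CREATE",
  "timestampMillis" : 1496934288944,
  "updatedAttributes" : { "path" : "./" },
  "parentIds" : [ ]
}
""")

schema = infer_type(sample, "provenance")
print(json.dumps(schema, indent=2))
```

In this sketch eventOrdinal comes out as an int, timestampMillis as a long, and updatedAttributes as a nested record. How the real processor treats empty arrays and empty objects such as parentIds and previousAttributes is not something this example tries to reproduce.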

Note that the timestamps are longs because InferAvroSchema does not support
Avro logical types (such as timestamp, date, decimal). I'd like to see an
InferRecordSchema that is record-aware, supports time/date types, etc. I
wrote up a Jira a while back to cover it [1] but haven't gotten around to
implementing it yet.
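
For reference, a hand-written schema can annotate such a long with the Avro timestamp-millis logical type. The Python sketch below shows what that field declaration might look like and checks that the timestampMillis value from the record above matches its timestamp string. The names TIMESTAMP_FIELD and millis_to_datetime are made up for this example, and nothing here is produced by InferAvroSchema.

```python
import json
from datetime import datetime, timedelta, timezone

# Avro field declaration: a long annotated with the timestamp-millis logical type.
TIMESTAMP_FIELD = {
    "name": "timestampMillis",
    "type": {"type": "long", "logicalType": "timestamp-millis"},
}

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def millis_to_datetime(millis):
    """Convert milliseconds since the Unix epoch to a UTC datetime."""
    # Integer timedelta arithmetic avoids floating-point rounding of the millis.
    return EPOCH + timedelta(milliseconds=millis)

print(json.dumps(TIMESTAMP_FIELD))

ts = millis_to_datetime(1496934288944)
print(ts.isoformat(timespec="milliseconds"))  # 2017-06-08T15:04:48.944+00:00
```

The printed instant is the same one as the "timestamp" string in the example record, so the long and the ISO string carry the same information; the logical type only tells readers of the schema how to interpret the long.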

Regards,
Matt

[1] https://issues.apache.org/jira/browse/NIFI-4109


On Mon, Aug 13, 2018 at 11:02 AM Kuhfahl, Bob <rkuhf...@mitre.org> wrote:

> Trying to develop a sample input file of json data to feed into
> InferAvroSchema so I can feed that into PutDatabaseRecord.
>
> Need a hello world example ☺
>
>
>
> But, to get started, I’d be happy to get InferAvroSchema working.  I’m
> “trial and error”-ing the input file hoping to get lucky, but..
>
>
>
> No log messages, flow of json data is going to failure,  I’m reading the
> code for InferAvroSchema()
>
> But it just calls  JsonUtil.inferSchema(), so I’ll keep digging down the
> path but… if someone has a sample input that demonstrates how it’s supposed
> to work, I’d be grateful!
