You essentially need to create a dataset in HDFS using the kite-dataset
tool (http://kitesdk.org/docs/1.1.0/cli-reference.html#create). You use
Avro to define your schema, and then you tell Kite that you want the data
to be in Parquet format.

You will use the StoreInKiteDataset processor to write the data. Keep in
mind that the data must be given to the processor as Avro records. You can
use other processors (ConvertJSONToAvro, ConvertCSVToAvro, etc.) to marshal your data into
that format.

*Example:*
*-- Schema (user.avsc)*

{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number",  "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}
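Before wiring up the NiFi flow, you can sanity-check the schema with nothing but the Python standard library and confirm a sample record supplies exactly the fields it declares (the record values and "example.avro" namespace below are made up for illustration):

```python
import json

# The User schema, inlined here so the example is self-contained;
# the "example.avro" namespace is illustrative -- use your own.
schema = json.loads("""
{"namespace": "example.avro",
 "type": "record",
 "name": "User",
 "fields": [
     {"name": "name", "type": "string"},
     {"name": "favorite_number", "type": ["int", "null"]},
     {"name": "favorite_color", "type": ["string", "null"]}
 ]
}
""")

# A sample JSON record, shaped like the input ConvertJSONToAvro expects.
record = {"name": "Alyssa", "favorite_number": 256, "favorite_color": None}

# The record should cover exactly the fields the schema declares.
declared = {f["name"] for f in schema["fields"]}
assert declared == record.keys()
print(sorted(declared))  # ['favorite_color', 'favorite_number', 'name']
```

This only checks field coverage, not types; the Kite processors do full schema validation for you at conversion time.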

*-- Create Dataset*

The example below creates a dataset locally on disk for testing; once
you're done testing, swap the location's "file://" URI for an "hdfs://"
URI to put the dataset in HDFS.

./kite-dataset create users --schema user.avsc --format parquet \
  --location file:///tmp/parquet_users
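Once the dataset exists, you can also smoke-test it from the CLI before involving NiFi at all, e.g. by importing a small CSV with kite-dataset's csv-import command (the file contents and /tmp path here are made up; csv-import maps columns to fields by the header row):

```shell
# Write a tiny CSV whose header matches the schema's field names
# (sample data is illustrative).
printf 'name,favorite_number,favorite_color\nAlyssa,256,blue\n' > /tmp/users.csv

# Import it into the dataset created above, then print the records back.
./kite-dataset csv-import /tmp/users.csv users
./kite-dataset show users
```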


On Wed, May 11, 2016 at 1:04 PM, Joe Witt <joe.w...@gmail.com> wrote:

> Hello - can you please register for the dev@nifi.apache.org mailing
> list.  Otherwise I am having to manually approve each email which can
> result in delays.
>
> Just go here to do so: https://nifi.apache.org/mailing_lists.html
>
> Thanks
> Joe
>
> On Wed, May 11, 2016 at 2:45 PM, pradeepbill <pradeep.b...@gmail.com>
> wrote:
> > thanks Ricky, I am a starter here, can you point me to a link please.An
> > example would help greatly.
> >
> >
> >
> >
> > --
> > View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/parquet-format-tp10145p10168.html
> > Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>



-- 
Ricky Saltzer
http://www.cloudera.com
