I have my base data in Avro format with the following properties:
- Number of records: 500 million
- Size of data: 400 GB
I evaluated the options below, but none of them seems viable for me.
- Load the data into Phoenix using the phoenix-pig connector. Time to load:
14 hours (batch size = 200 records).
- Phoenix bulk load tool. My data is raw data, so I can't define a
delimiter explicitly (if ',' is the delimiter, a few of the columns contain
the ',' character, and likewise with other delimiters).
- In addition, to produce the data in delimited format I would need to run
another MR job, which I want to avoid.
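
For reference, here is a rough back-of-the-envelope sketch of what the 14-hour phoenix-pig load works out to. It assumes 1 GB = 10^9 bytes and a uniform load rate over the whole run; the function and variable names are only illustrative.

```python
# Rough throughput of the phoenix-pig load, using the figures quoted above.
# Assumptions: 1 GB = 10**9 bytes, and the load rate is uniform over the run.
RECORDS = 500_000_000
SIZE_BYTES = 400 * 10**9
LOAD_HOURS = 14


def load_stats(records, size_bytes, hours):
    """Return average throughput and record size for a completed load."""
    seconds = hours * 3600
    return {
        "records_per_sec": records / seconds,
        "mb_per_sec": size_bytes / seconds / 10**6,
        "avg_record_bytes": size_bytes / records,
    }


stats = load_stats(RECORDS, SIZE_BYTES, LOAD_HOURS)
for name, value in stats.items():
    print(f"{name}: {value:,.1f}")
```

Under those assumptions this comes to roughly 9,900 records per second, or about 8 MB/s, with an average record size of 800 bytes.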
Is there a way to bulk load Avro data into Phoenix directly?
Regards,
Nagarjuna K