Hello. I want to know whether this is possible or not.
There is an external table created with the following DDL, where the partition column date_string holds the day (e.g. 20160212):

CREATE EXTERNAL TABLE test (message STRING)
PARTITIONED BY (date_string STRING)
STORED AS ORC
LOCATION '/message';

I will never add rows to this table with an INSERT statement. Instead I want to:

#1. Add each day's data directly to the partition location on HDFS, e.g. /message/20160212 (with $ hadoop fs -put).
#2. Add the partition every morning: ALTER TABLE test ADD PARTITION (date_string='20160212') LOCATION '/message/20160212';
#3. Query the added data.

With this scenario, what should I do, or how can I prepare the ORC-formatted data in step #1? When the stored format is TEXTFILE I just need to copy the raw file into the partition directory, but with an ORC table I don't think it is that simple. The raw application log is JSON formatted, and each day may have around 1M JSON rows.

I already run this job on my cluster with a TEXTFILE table, not ORC; now I am trying to switch the table format. Any advice would be great. Thanks.
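P.S. For context, the only way I currently know to produce ORC files is to go through Hive itself: point a plain TEXTFILE staging table at the day's raw JSON files and rewrite them into the ORC partition with an INSERT. A minimal sketch of that idea, assuming a staging directory /raw_message/20160212 and a staging table name raw_message (both just placeholders):

-- staging table over the day's raw JSON files (plain text, one JSON document per line)
CREATE EXTERNAL TABLE raw_message (message STRING)
STORED AS TEXTFILE
LOCATION '/raw_message/20160212';

-- each raw line lands in the single message column and gets rewritten as ORC
INSERT OVERWRITE TABLE test PARTITION (date_string='20160212')
SELECT message FROM raw_message;

This works, but it brings back the INSERT step (and a second copy of the data) that I was hoping to avoid, which is why I am asking whether the ORC files can be prepared outside Hive and simply dropped into the partition directory.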