Hi,
We have a Hive table that is written to under two partition keys, day and
hour.
We would like to stream the incoming files, but since fileStream can only
listen on one directory, we start a streaming job on the latest partition
and, every hour, kill it and start a new one on the newer partition. (We
are also working on migrating the stream from HDFS to Kafka, but that will
take a while.)
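
To make the workaround concrete, here is a rough sketch of how the directory
for each hourly restart could be computed. The base path and the
day=/hour= directory naming are assumptions for illustration only, not
our actual layout, and the helper names are made up.

```python
from datetime import datetime, timedelta

# Hypothetical base path; the day=/hour= layout is an assumption.
BASE = "hdfs:///warehouse/events"

def partition_dir(ts, base=BASE):
    """Return the directory of the (day, hour) partition containing ts."""
    return "{}/day={}/hour={:02d}".format(base, ts.strftime("%Y-%m-%d"), ts.hour)

def next_partition_dir(ts, base=BASE):
    """Return the directory the restarted job would listen on one hour later."""
    return partition_dir(ts + timedelta(hours=1), base)
```

Each hour the running job is stopped and a new one is started on whatever
next_partition_dir returns, which is the part we would like to avoid.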

I imagine I'm not the first to try this. Is there a better way to either
stream multiple directories or change the streaming source location at
runtime (or any other suggestion)?


Thank you.
Daniel
