On Monday, September 20, 2010, Bennie Schut <bsc...@ebuddy.com> wrote:
>
> Hi all,
>
> We are sometimes getting file not found exceptions while
> running large queries on hive. During these large queries we also import data
> on the partitions we are querying which raises a question for us. How does hive
> handle data which is being modified in the background?
>
> We use insert overwrite on the partitions so I can imagine
> the large query can be surprised with some new files and some missing old
> files.
>
> If others are experiencing this how do they work around
> this? Perhaps partition on 2 keys so you don’t overwrite existing data?
>
> Thanks for any pointers on this.
>
> Bennie.

I do not think Hive/MapReduce has a great way of dealing with moving
targets, i.e. when the content changes between getting the splits and
task execution.

We have a program, file crush, which crushes up small files. We
implemented read and write locks on a per-table basis. I expect the new
zk (ZooKeeper) locking might handle this better.
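
To make the idea of table-level read/write locks concrete, here is a
minimal sketch in Python. It is not our actual implementation: the class
name TableLocks and its methods are hypothetical, and it only coordinates
threads inside one process. A real setup would need a lock that every
client on the cluster can see (which is what the zk locking is meant to
provide). Queries take the read lock, and anything that overwrites a
table or its partitions takes the write lock.

```python
import threading
from collections import defaultdict
from contextlib import contextmanager


class TableLocks:
    """Hypothetical in-process registry of per-table reader/writer locks."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = defaultdict(int)  # table name -> active reader count
        self._writers = set()             # tables currently being overwritten

    @contextmanager
    def read(self, table):
        with self._cond:
            # Block while a writer is replacing this table's files.
            while table in self._writers:
                self._cond.wait()
            self._readers[table] += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers[table] -= 1
                self._cond.notify_all()

    @contextmanager
    def write(self, table):
        with self._cond:
            # Block until the table has no readers and no other writer.
            while table in self._writers or self._readers[table] > 0:
                self._cond.wait()
            self._writers.add(table)
        try:
            yield
        finally:
            with self._cond:
                self._writers.discard(table)
                self._cond.notify_all()
```

A long query would run inside "with locks.read('mytable'):" and the
import job inside "with locks.write('mytable'):", so the overwrite waits
until running queries finish. Note that this simple version lets a steady
stream of readers starve a writer.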
