You may want to reach out to the HDFS dev list for the format of the edit
log. There is a lot of information there, and I am not sure how accurate I am.

At one of my previous jobs, we converted the daily edit log into a
partitioned Hive table and did exactly what you want to do.
Sadly, we could not open-source that product.
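
For what it's worth, below is a rough, untested sketch along the lines of
Edward's snippet further down, using the HiveMetaStoreClient and Hadoop
FileSystem APIs. The class name, the "default" database, and the table name
are placeholders; adjust them for your setup. It lists the files under the
table's storage location and prints each file's size and last modification
time. For a partitioned table you would need to recurse into the partition
directories, e.g. with fs.listFiles(location, true).

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.conf.HiveConf;
import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
import org.apache.hadoop.hive.metastore.api.Table;

public class TableFileInfo {
  public static void main(String[] args) throws Exception {
    HiveConf hiveConf = new HiveConf();
    HiveMetaStoreClient client = new HiveMetaStoreClient(hiveConf);

    // 1. Which files make up the table: start from its storage location
    //    as recorded in the metastore ("default" / "name_of_table" are
    //    placeholders).
    Table t = client.getTable("default", "name_of_table");
    Path location = new Path(t.getSd().getLocation());

    FileSystem fs = location.getFileSystem(hiveConf);
    for (FileStatus status : fs.listStatus(location)) {
      // 2. Size in bytes, 3. last modification time (epoch millis).
      System.out.printf("%s\t%d bytes\t%d%n",
          status.getPath(), status.getLen(), status.getModificationTime());
    }
    client.close();
  }
}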


On Tue, Oct 8, 2013 at 12:26 AM, demian rosas <demia...@gmail.com> wrote:

> Edward,
>
> Thanks a lot for this info !!!
>
> This gives me a clearer picture of the problem and how I can approach it.
>
> Cheers.
>
>
> On 7 October 2013 11:52, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>
>> Not a direct API.
>>
>> What I do is this, from Java/Thrift:
>> Table t = client.getTable("name_of_table");
>> Path p = new Path(t.getSd().getLocation());
>> FileSystem fs = FileSystem.get(conf);
>> FileStatus[] files = fs.listStatus(p);
>> // your logic here.
>>
>>
>>
>>
>> On Mon, Oct 7, 2013 at 2:01 PM, demian rosas <demia...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I want to track the changes made to the files of a Hive table.
>>>
>>> I wonder whether there is any API that I can use to find out the
>>> following:
>>>
>>> 1. Which files in HDFS constitute a Hive table.
>>> 2. What is the size of each of these files.
>>> 3. The timestamp of the creation/last update of each of these files.
>>>
>>>
>>> Also, taking a wider view, is there any API that can do the above for
>>> HDFS files in general (not only Hive tables)?
>>>
>>> Thanks a lot in advance.
>>>
>>> Cheers.
>>>
>>>
>>>
>>
>


-- 
Nitin Pawar
