Shwetha, Venkatesh,

Thanks for the clarification. Using a Hive script instead of Java APIs is 
actually quite clever; it avoids an API dependency.

The only limitation of using Hive's Exim functionality is that it might not be 
forward compatible. (For instance, metadata exported from Hive 14 might not be 
deserializable on Hive 13.) That might pose a limitation on the direction in 
which data can be replicated.
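(For anyone following along, the Exim round trip under discussion is just a pair of Hive commands; the table name, partition spec, and staging paths below are hypothetical placeholders, not Falcon's actual configuration:)

```sql
-- On the source cluster: write the partition's data files plus the
-- serialized table/partition metadata (the _metadata file) to a
-- staging directory on HDFS.
EXPORT TABLE sales PARTITION (ds='2014-06-10')
TO '/apps/falcon/staging/sales_export';

-- After the staging directory has been copied to the target cluster
-- (e.g. via distcp): recreate the partition there, resolving the
-- schema from the exported metadata rather than from a live metastore.
IMPORT TABLE sales PARTITION (ds='2014-06-10')
FROM '/apps/falcon/staging/sales_export';
```

The forward-compatibility concern above applies to the `_metadata` file in that staging directory: it is written by the source cluster's Hive version and read by the target's.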

Thanks for your time.

Mithun


On Wednesday, June 11, 2014 1:03 PM, Seetharam Venkatesh 
<[email protected]> wrote:
 


Mithun,

> I couldn't find the Falcon code that exports the HCat/Hive metadata to HDFS.
Hive partition replication is part of the oozie workflow. 
$incubator-falcon/feed/src/main/resources/config/workflow/replication-workflow.xml



> The table/partition metadata might currently be serialized to HDFS in thrift 
>(I'll have to check). 
The table metadata is serialized as XML, and since it uses Hive's internal APIs 
(Exim), the compatibility is taken care of. 




On Tue, Jun 10, 2014 at 9:59 PM, Shwetha GS <[email protected]> wrote:

Hi Mithun,
>
>1. Table export is done using hive export command which is part of hive
>action in oozie replication workflow:
>https://github.com/apache/incubator-falcon/blob/master/feed/src/main/resources/config/workflow/falcon-table-export.hql
>
>2. Yes, falcon assumes that hive at source and target are compatible. Do
>you see any issues?
>
>-Shwetha
>
>
>
>On Wed, Jun 11, 2014 at 1:59 AM, Mithun Radhakrishnan <
>[email protected]> wrote:
>
>> Greetings, Falcon-dev.
>>
>> I've a n00b question about Falcon's support for HCatalog partition-import.
>> My (incomplete) understanding is that the implementation copies data
>> alongside the serialized metadata, and resolves the partition-schema on the
>> target cluster.
>>
>> 1. I couldn't find the Falcon code that exports the HCat/Hive metadata to
>> HDFS. I expected that org.apache.hadoop.hive.ql.parse.EximUtil might be
>> used for this, but there's no reference to this class in the Falcon master
>> branch. Might I please enquire where/how that's done? A pointer to code
>> would be ideal, thanks.
>>
>> 2. The table/partition metadata might currently be serialized to HDFS in
>> thrift (I'll have to check). Does Falcon currently assume that the Hive
>> versions running on the source and target clusters are compatible? (i.e.
>> That the metadata can be imported on the target?)
>>
>> Thanks,
>> Mithun
>


-- 

Regards,
Venkatesh

“Perfection (in design) is achieved not when there is nothing more to add, but 
rather when there is nothing more to take away.” 

- Antoine de Saint-Exupéry
