The object registry is an opt-in feature that Tez provides to applications
(e.g. Hive and Pig). If an application chooses to use it, it can do
user-land caching across tasks, vertices, and DAGs. For example, Hive
caches the smaller broadcast side of a broadcast join in the shared object
registry.

The object registry is not an automatic data caching or input caching
mechanism.

Which application or job are you running: Hive, Pig, or something custom?
Unless the application (like Hive) uses object caching for a cross-DAG
scenario (which AFAIK it does not), you will not see any difference. If
it's custom, you will have to use the object registry explicitly, in a
manner that makes sense for your app.
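
To make "explicitly use" concrete, here is a minimal sketch of the
get-or-build pattern a custom processor might follow. It does not depend on
Tez: SessionRegistry and InMemoryRegistry are hypothetical stand-ins modeled
on the get and cacheForSession methods of Tez's ObjectRegistry interface,
and loadSmallTable is a made-up placeholder for whatever expensive object
your app wants to share. In a real task I would expect the registry to come
from the task context (getObjectRegistry()) instead, so please check the
Javadoc for your Tez version before relying on the exact signatures.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical stand-in for the subset of Tez's ObjectRegistry used here.
interface SessionRegistry {
    Object cacheForSession(String key, Object value);
    Object get(String key);
}

// In-memory implementation so the sketch runs without a Tez cluster.
class InMemoryRegistry implements SessionRegistry {
    private final Map<String, Object> store = new ConcurrentHashMap<>();

    @Override
    public Object cacheForSession(String key, Object value) {
        // Returns the previously cached value, or null if there was none.
        return store.put(key, value);
    }

    @Override
    public Object get(String key) {
        return store.get(key);
    }
}

class RegistryCacheSketch {
    // Counts how often the expensive build actually runs.
    static int builds = 0;

    // Look the object up first; build and cache it only on a miss.
    @SuppressWarnings("unchecked")
    static <T> T getOrBuild(SessionRegistry registry, String key,
                            Supplier<T> builder) {
        T cached = (T) registry.get(key);
        if (cached != null) {
            return cached;
        }
        T built = builder.get();
        registry.cacheForSession(key, built);
        return built;
    }

    // Placeholder for an expensive load, e.g. the small side of a join.
    static Map<String, String> loadSmallTable() {
        builds++;
        Map<String, String> table = new HashMap<>();
        table.put("1", "one");
        table.put("2", "two");
        return table;
    }

    public static void main(String[] args) {
        SessionRegistry registry = new InMemoryRegistry();
        // Two calls stand in for two tasks (or two DAGs in one session)
        // that land in the same reused container.
        getOrBuild(registry, "small-table", RegistryCacheSketch::loadSmallTable);
        getOrBuild(registry, "small-table", RegistryCacheSketch::loadSmallTable);
        System.out.println("expensive builds: " + builds);
    }
}
```

The point of the sketch is that nothing is cached unless the task code asks
for it: the second lookup is only cheap because the first one stored the
object under an agreed key, and in Tez that only helps when the later task
runs in a container that still holds the registry contents.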


-----Original Message-----
From: Raajay [mailto:[email protected]] 
Sent: Tuesday, December 1, 2015 10:36 AM
To: [email protected]
Subject: Shared object registry

How do I use the shared object registry effectively?

I created a Tez client as a session and submitted a DAG twice,
sequentially.

However, I did not see a noticeable difference in their run times. The
query was TPC-DS query #3.

I had enabled container reuse in tez-site.xml. Are there other configs I
need to set correctly to use shared objects?

- Raajay

