The object registry is an opt-in feature that Tez provides to the application (e.g. Hive and Pig). If the application chooses to use it, it can do user-land caching across tasks, vertices, and DAGs. For example, Hive caches the smaller broadcast side of a broadcast join in the shared object registry.
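To make the explicit caching described above a bit more concrete, here is a rough sketch of the lookup-then-cache pattern an application would follow. It does not use the real Tez classes: `SketchRegistry` and `InMemoryRegistry` are hypothetical stand-ins so the example runs on its own, with method names modeled on what I recall of Tez's `ObjectRegistry` (`get`, `cacheForSession`). In a real processor the registry would presumably come from the task context (I believe via `getObjectRegistry()`), so please check the Javadoc for your Tez version. `loadSmallTable` and the key name are also made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

public class RegistrySketch {

    /** Hypothetical stand-in for the Tez registry; only the two calls used here. */
    public interface SketchRegistry {
        Object cacheForSession(String key, Object value);
        Object get(String key);
    }

    /** In-memory implementation so the sketch runs without a Tez cluster. */
    public static final class InMemoryRegistry implements SketchRegistry {
        private final Map<String, Object> entries = new ConcurrentHashMap<>();

        @Override
        public Object cacheForSession(String key, Object value) {
            return entries.put(key, value); // returns the previous value, or null
        }

        @Override
        public Object get(String key) {
            return entries.get(key);
        }
    }

    /** Counts how often the expensive load actually runs. */
    public static int loadCount = 0;

    /** Pretend this reads the small side of a broadcast join (made-up data). */
    public static Map<String, String> loadSmallTable() {
        loadCount++;
        Map<String, String> table = new HashMap<>();
        table.put("1", "store_a");
        table.put("2", "store_b");
        return table;
    }

    /** Look the object up first; build and cache it only on a miss. */
    @SuppressWarnings("unchecked")
    public static <T> T getOrLoad(SketchRegistry registry, String key, Supplier<T> loader) {
        Object cached = registry.get(key);
        if (cached != null) {
            return (T) cached;
        }
        T fresh = loader.get();
        registry.cacheForSession(key, fresh);
        return fresh;
    }

    public static void main(String[] args) {
        SketchRegistry registry = new InMemoryRegistry();
        // Two "tasks" sharing one registry ask for the same object.
        Map<String, String> first =
            getOrLoad(registry, "small_table", RegistrySketch::loadSmallTable);
        Map<String, String> second =
            getOrLoad(registry, "small_table", RegistrySketch::loadSmallTable);
        System.out.println("same instance: " + (first == second));
        System.out.println("loads: " + loadCount);
    }
}
```

The point of the sketch is that nothing is cached unless the application code itself calls the registry: the second lookup is cheap only because the first one stored the object under a key both lookups agree on.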
The object registry is not an automatic data caching or input caching mechanism. What application/job are you running: Hive, Pig, or custom? Unless the application (like Hive) uses the object registry for a cross-DAG scenario (which AFAIK it does not), you will not see any difference. If it's custom, then you will have to explicitly use the object registry in a manner that makes sense for your app.

-----Original Message-----
From: Raajay [mailto:[email protected]]
Sent: Tuesday, December 1, 2015 10:36 AM
To: [email protected]
Subject: Shared object registry

How to effectively use shared object registry? I created a Tez client as a session and submitted a DAG twice sequentially. However, I did not see a noticeable difference in their run times. The query was TPC-DS query #3. I had enabled container reuse in tez-site.xml. Are there other configs I need to ensure are set correctly to use shared objects?

- Raajay
