Hi Goel.
Welcome. Great thoughts and use cases. I will let others chime in here, but
one question would be: what is your "process"? For example, is it HiveQL?
Sqoop? Something for which a supported "hook" exists today? (See the
atlas.apache.org page with the high-level specs, API, and hooks list.) Also,
are your datasets known to Atlas (Hive, for example), or are they external to
Atlas?
This would help outline what can work today directly and what might need
custom work.
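To make the "custom work" option a bit more concrete, here is a minimal
sketch, assuming the datasets and processes are external to Atlas and are
registered through the v2 REST API using the generic DataSet and Process
types. Because a Process lists DataSet entities as its inputs and outputs, the
sketch puts a placeholder intermediate dataset between consecutive processes
rather than linking one process directly to the next. The function name, the
"name@namespace" qualifiedName scheme, and the "<process>_output" dataset
names are all hypothetical, and the negative guids are temporary placeholders
for entities that do not exist yet. Nothing is sent to a server here.

```python
import json


def build_lineage_payload(source, target, processes, namespace="demo"):
    """Build a bulk-entity payload for source -> p1 -> ... -> pn -> target.

    Hypothetical helper: names and the qualifiedName scheme are illustrative.
    """
    if not processes:
        raise ValueError("at least one process is required")

    entities = []
    next_guid = -1  # negative guids act as placeholders for new entities

    def new_entity(type_name, name):
        nonlocal next_guid
        entity = {
            "typeName": type_name,
            "guid": str(next_guid),
            "attributes": {
                "name": name,
                "qualifiedName": f"{name}@{namespace}",  # assumed scheme
            },
        }
        next_guid -= 1
        entities.append(entity)
        return entity

    def ref(entity):
        # Object-id style reference to another entity in the same payload
        return {"guid": entity["guid"], "typeName": entity["typeName"]}

    current = new_entity("DataSet", source)
    for i, proc_name in enumerate(processes):
        is_last = i == len(processes) - 1
        # Intermediate datasets are invented so each process has an output
        out_name = target if is_last else f"{proc_name}_output"
        output = new_entity("DataSet", out_name)
        proc = new_entity("Process", proc_name)
        proc["attributes"]["inputs"] = [ref(current)]
        proc["attributes"]["outputs"] = [ref(output)]
        current = output

    return {"entities": entities}


payload = build_lineage_payload(
    "DataSet1", "DataSet2", ["Process1", "Process2", "Process3"]
)
print(json.dumps(payload, indent=2))
```

A payload of this shape would be POSTed to /api/atlas/v2/entity/bulk. The
nested case (Process1 itself being a pipeline of sub-processes) is not covered
by this sketch.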
There is great work being done by the team here to further enhance the
underlying models (as you can see in recent JIRAs), which will facilitate
even wider creative use cases.
Looking forward to the details on your "process" and dataset object types.
Ernie
Rajat Goel --- Regarding pipeline of multiple processes and ATLAS-1236 ---
From: "Rajat Goel" <[email protected]>
To: [email protected]
Date: Thu, Aug 24, 2017 2:06 PM
Subject: Regarding pipeline of multiple processes and ATLAS-1236
Hi,
I am a new user exploring Apache Atlas for metadata management. I have a use
case where I want to track lineage across a pipeline of processing functions,
i.e. DataSet1 -> Function/Process1 -> Process2 -> Process3 -> DataSet2.
Another use case is that each of the above Processes/Functions could itself
be a named pipeline, e.g. DataSet1 -> Process1 (which is a pipeline of
SubProcess1 -> SubProcess2) -> Process2 -> DataSet2.
Can Apache Atlas support these use cases for metadata management and lineage?
If yes, please suggest how. If not, is there any plan to support them in the
future?
I found one Jira improvement ticket, ATLAS-1236, which looks relevant to the
above use cases. Is there any plan to resolve it in upcoming releases?
Thanks & Regards,
Rajat Goel