Jesang Yoon created ZEPPELIN-1135:
-------------------------------------

             Summary: Provide a manifest for data & interface to use it
                 Key: ZEPPELIN-1135
                 URL: https://issues.apache.org/jira/browse/ZEPPELIN-1135
             Project: Zeppelin
          Issue Type: New Feature
          Components: documentation, GUI, zeppelin-interpreter
    Affects Versions: 0.7.0
            Reporter: Jesang Yoon
            Priority: Minor

While using data from various sources (different URLs) to run mixed data analyses in Zeppelin, my team ran into problems managing the many data source URLs and sharing them between teammates. I therefore propose solving this by providing a "manifest of data and an interface to use it", and I would like to build consensus among contributors and the PPMC before building and committing code.

h4. Pain points

* Files and resources tend to be scattered across various locations (HDFS, web, etc.).
* It is somewhat complicated to remember and identify where data lives, and to use a long URL for it.
* A URL alone is not enough to describe what the data contains.

h4. How to resolve it

# Define a web-based document format (XML/JSON/YAML) that contains a manifest (or metadata) of the data the team can use:
#* Title of data
#* Location of data (URL)
#* Description of data
#* Tags of data (for search)
# Build a Zeppelin interface function to search and view the descriptions of data defined in step 1.
# Build a Zeppelin interface function that returns the real location of the data found in step 2, for use with the load() functions of the various interpreters.

h4. Effects

* Teammates can share a single, clean source of information about data.
* There is no need to track down and change every URL in notebooks when the location of data changes.
* Data is easy to search for and use in analysis code.

Please review this idea and give comments :)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
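
To make the three steps under "How to resolve it" more concrete, here is a minimal sketch in Python, assuming a JSON manifest. Everything in it is hypothetical and only for illustration: the `datasets` key, the sample entries and URLs, and the function names `load_manifest`, `search`, and `resolve` are not part of Zeppelin or of any existing API. Only the four fields (title, location, description, tags) come from the proposal itself.

```python
import json

# Hypothetical manifest (step 1). The four fields mirror the proposal;
# the "datasets" key and both entries are invented for this sketch.
MANIFEST_JSON = """
{
  "datasets": [
    {
      "title": "web-logs",
      "location": "hdfs://example-cluster/data/web/logs.parquet",
      "description": "Example web server access logs",
      "tags": ["logs", "web"]
    },
    {
      "title": "user-profiles",
      "location": "https://example.com/data/users.csv",
      "description": "Example user profile export",
      "tags": ["users", "csv"]
    }
  ]
}
"""


def load_manifest(text):
    """Parse the manifest text and index its entries by title."""
    entries = json.loads(text)["datasets"]
    return {entry["title"]: entry for entry in entries}


def search(manifest, keyword):
    """Step 2: return sorted titles whose title, description, or tags
    contain the keyword (case-insensitive substring match)."""
    needle = keyword.lower()
    hits = []
    for title, entry in manifest.items():
        haystack = [title, entry["description"], *entry["tags"]]
        if any(needle in field.lower() for field in haystack):
            hits.append(title)
    return sorted(hits)


def resolve(manifest, title):
    """Step 3: return the real location (URL) registered for a title."""
    return manifest[title]["location"]


manifest = load_manifest(MANIFEST_JSON)
print(search(manifest, "logs"))
print(resolve(manifest, "user-profiles"))
```

With something like this, a notebook would refer to data by title and pass the result of `resolve(...)` to an interpreter's load function, so a moved file would only require a change in the manifest rather than in every notebook.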