[jira] [Comment Edited] (SPARK-16864) Comprehensive version info
[ https://issues.apache.org/jira/browse/SPARK-16864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410596#comment-15410596 ]

Jan Gorecki edited comment on SPARK-16864 at 8/6/16 11:50 AM:
--

Record the exact Spark source code reference while processing an ETL workflow, so that performance implications can be measured precisely against a point in time of the source code.
I doubt that a version number or date/time is a natural key for Spark source code, is it? If you don't have a natural key you can't build a reliable workflow.
How would you automatically git clone, reset, build, deploy and re-run your workflow - based on data collected by Spark - if you don't even have the git commit there?
Looking up the git commit hash by version and date... sure, it works, but why can't users just access that info directly?
I don't see ANY reason not to have that feature. If you have one, I would be glad to read it.
And no, even for developers that info is not available at runtime.

was (Author: jangorecki):
    Record exact spark source code reference while processing ETL workflow so performance implication can be measures precisely referencing point in time of source code.
    I doubt if version number or date/time is a natural key for spark source code, is it? If you don't have a natural key you can't build reliable workflow.
    How would you automatically git clone, reset, build, deploy and re-run your workflow - based on data collected by spark - if you don't even have git commit there?
    Lookup git commit hash by version and date... sure it works, but why users can't just access that info directly?
    I don't see ANY reason to not have that feature? If you have any I would be glad to read.
    And no, even for developers that info is not available on runtime.
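The clone/reset/build step asked about in the comment above could look roughly like the following. This is only a sketch under assumptions: the function name reproduce_commands, the default repository URL, the working directory and the Maven build command are illustrative choices, not something taken from this issue. It only builds the command list; it does not run anything.

```python
import re

# A full git commit hash is 40 lowercase hex characters (SHA-1).
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def reproduce_commands(commit,
                       repo_url="https://github.com/apache/spark.git",
                       workdir="spark-src"):
    """Return the commands (as argv lists) to rebuild the source at `commit`.

    `repo_url`, `workdir` and the build command are assumptions made for
    this sketch; adjust them to your own setup.
    """
    if not FULL_SHA.fullmatch(commit):
        # A version string such as "2.0.0" is rejected: it does not
        # identify a single point in the source history.
        raise ValueError(f"not a full git commit hash: {commit!r}")
    return [
        ["git", "clone", repo_url, workdir],
        ["git", "-C", workdir, "reset", "--hard", commit],
        # Assumed build command, to be run with workdir as the current directory.
        ["./build/mvn", "-DskipTests", "clean", "package"],
    ]
```

Each returned list could be passed to subprocess.run. The point of the sketch is that the commit hash recorded with the workflow data is the only input needed, whereas a version string is rejected up front.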
> Comprehensive version info
> ---
>
>                 Key: SPARK-16864
>                 URL: https://issues.apache.org/jira/browse/SPARK-16864
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: jay vyas
>
> Spark versions can be grepped out of the Spark banner that comes up on startup, but otherwise, there is no programmatic/reliable way to get version information.
> Also there is no git commit id, etc. So precise version checking isn't possible.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-16864) Comprehensive version info
[ https://issues.apache.org/jira/browse/SPARK-16864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410238#comment-15410238 ]

Jan Gorecki edited comment on SPARK-16864 at 8/5/16 10:52 PM:
--

Hi, the git commit is relevant to applications at runtime whenever the subject of the application (in any dimension) is Spark itself. I don't understand why that info would not be included.
This may not be a problem for people who build from source; if needed, they can put that metadata in a plaintext file (still an overhead). The bigger problem is for those who grab binaries and, for example, just want to track performance in their cluster over Spark's git history.
A git commit hash is the natural key for a project's source code; you won't find a better field to reference source code. Referencing a release version is a different thing.

was (Author: jangorecki):
    Hi, git commit is relevant to applications at runtime as long as the subject for an application (in any dimension) is spark itself. I don't understand why that info would not be included.
    This may not be a problem for people who build from source, they can eventually put that metadata in plaintext file (still an overhead). The bigger problem is for those who grab binaries and for example just want to track performance in their cluster over spark git history.
    Git commit hash is a natural key for a source code a project, you won't find better field to references the source code. Referencing release versions is simply a different thing.

> Comprehensive version info
> ---
>
>                 Key: SPARK-16864
>                 URL: https://issues.apache.org/jira/browse/SPARK-16864
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: jay vyas
>
> Spark versions can be grepped out of the Spark banner that comes up on startup, but otherwise, there is no programmatic/reliable way to get version information.
> Also there is no git commit id, etc. So precise version checking isn't possible.
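The plaintext-file workaround mentioned in the comment above, for people who build from source, might be sketched as follows. The file layout (key=value lines), the key names version and revision, and both function names are hypothetical and chosen only for illustration; nothing here is an existing Spark file or API.

```python
from pathlib import Path


def write_build_info(path, version, revision):
    """Write build metadata as key=value lines (hypothetical layout)."""
    Path(path).write_text(
        f"version={version}\nrevision={revision}\n", encoding="utf-8"
    )


def read_build_info(path):
    """Read key=value lines back into a dict, skipping blanks and # comments."""
    info = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        info[key.strip()] = value.strip()
    return info
```

At build time the revision would typically come from the output of git rev-parse HEAD. An application could then call read_build_info at runtime and store the revision next to its performance measurements. Users of prebuilt binaries have no such file, which is the gap the comment describes.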
[jira] [Comment Edited] (SPARK-16864) Comprehensive version info
[ https://issues.apache.org/jira/browse/SPARK-16864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410238#comment-15410238 ]

Jan Gorecki edited comment on SPARK-16864 at 8/5/16 10:51 PM:
--

Hi, the git commit is relevant to applications at runtime whenever the subject of the application (in any dimension) is Spark itself. I don't understand why that info would not be included.
This may not be a problem for people who build from source; if needed, they can put that metadata in a plaintext file (still an overhead). The bigger problem is for those who grab binaries and, for example, just want to track performance in their cluster over Spark's git history.
A git commit hash is the natural key for a project's source code; you won't find a better field to reference the source code. Referencing release versions is simply a different thing.

was (Author: jangorecki):
    Hi, git commit is relevant to applications at runtime as long as the subject for an application (in any dimension) is spark itself. I don't understand why that info would not be included.
    This may not be a problem for people who build from source, they can eventually put that metadata in plaintext file (still an overhead). The bigger problem is for those who just grab binaries and for example just want to track performance in their cluster over spark git history.
    Git commit hash is a natural key for a source code a project, you won't find better field to references the source code. Referencing release versions is simply a different thing.

> Comprehensive version info
> ---
>
>                 Key: SPARK-16864
>                 URL: https://issues.apache.org/jira/browse/SPARK-16864
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: jay vyas
>
> Spark versions can be grepped out of the Spark banner that comes up on startup, but otherwise, there is no programmatic/reliable way to get version information.
> Also there is no git commit id, etc. So precise version checking isn't possible.