[jira] [Comment Edited] (SPARK-16864) Comprehensive version info

2016-08-06 Thread Jan Gorecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410596#comment-15410596
 ] 

Jan Gorecki edited comment on SPARK-16864 at 8/6/16 11:50 AM:
--

Record the exact Spark source code reference while processing an ETL workflow, 
so that performance implications can be measured precisely against a point in 
time of the source code. I doubt that a version number or date/time is a 
natural key for Spark source code, is it? If you don't have a natural key, you 
can't build a reliable workflow. How would you automatically git clone, reset, 
build, deploy and re-run your workflow, based on data collected by Spark, if 
you don't even have the git commit there? Looking up the git commit hash by 
version and date... sure, it works, but why can't users just access that info 
directly? I don't see ANY reason not to have that feature. If you have one, I 
would be glad to read it. And no, even for developers that info is not 
available at runtime.
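
The clone-and-reset part of that workflow can be sketched roughly as below. 
This is only an illustration under stated assumptions: the RunRecord schema 
and the rebuild_commands helper are hypothetical names, not anything Spark 
provides, and the build and deploy steps are left out because they are 
site-specific. The point is that the commit hash stored with each measurement 
is enough to restore the exact source tree.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunRecord:
    """One measurement keyed by source revision (hypothetical schema)."""
    commit: str      # git commit hash of the Spark build that ran the job
    workflow: str    # name of the ETL workflow that was measured
    seconds: float   # wall-clock duration of the run


def rebuild_commands(record: RunRecord, repo_url: str, workdir: str) -> List[List[str]]:
    """Return the git commands that restore the source tree for a recorded run.

    The commands are returned rather than executed, so the caller decides
    how to run them (for example with subprocess.run).
    """
    return [
        ["git", "clone", repo_url, workdir],
        ["git", "-C", workdir, "reset", "--hard", record.commit],
    ]


if __name__ == "__main__":
    # Placeholder values; a real record would carry a full commit hash.
    record = RunRecord(commit="0123abc", workflow="nightly-etl", seconds=812.4)
    for cmd in rebuild_commands(record, "https://github.com/apache/spark.git", "spark-src"):
        print(" ".join(cmd))
```

With only a version number and a date in the record, the commit field would 
first have to be looked up by hand, which is the extra step argued against 
above.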



> Comprehensive version info 
> ---
>
> Key: SPARK-16864
> URL: https://issues.apache.org/jira/browse/SPARK-16864
> Project: Spark
> Issue Type: Improvement
> Reporter: jay vyas
>
> Spark versions can be grepped out of the Spark banner that comes up on 
> startup, but otherwise, there is no programmatic/reliable way to get version 
> information.
> Also, there is no git commit id, etc., so precise version checking isn't 
> possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-16864) Comprehensive version info

2016-08-05 Thread Jan Gorecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410238#comment-15410238
 ] 

Jan Gorecki edited comment on SPARK-16864 at 8/5/16 10:52 PM:
--

Hi, the git commit is relevant to applications at runtime as long as the 
subject of the application (in any dimension) is Spark itself. I don't 
understand why that info would not be included. This may not be a problem for 
people who build from source; they can put that metadata in a plaintext file 
(still an overhead). The bigger problem is for those who grab binaries and, for 
example, just want to track performance in their cluster over Spark git 
history. The git commit hash is a natural key for the source code of a project; 
you won't find a better field to reference source code. Referencing a release 
version is a different thing.






[jira] [Comment Edited] (SPARK-16864) Comprehensive version info

2016-08-05 Thread Jan Gorecki (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15410238#comment-15410238
 ] 

Jan Gorecki edited comment on SPARK-16864 at 8/5/16 10:51 PM:
--

Hi, the git commit is relevant to applications at runtime as long as the 
subject of the application (in any dimension) is Spark itself. I don't 
understand why that info would not be included. This may not be a problem for 
people who build from source; they can put that metadata in a plaintext file 
(still an overhead). The bigger problem is for those who grab binaries and, for 
example, just want to track performance in their cluster over Spark git 
history. The git commit hash is a natural key for the source code of a project; 
you won't find a better field to reference the source code. Referencing release 
versions is simply a different thing.


