[ 
https://issues.apache.org/jira/browse/SPARK-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660420#comment-14660420
 ] 

Patrick Wendell commented on SPARK-1517:
----------------------------------------

Hey Ryan,

For the Maven snapshot releases, unfortunately we are constrained by Maven's
own SNAPSHOT version format, which doesn't allow encoding anything other than
the timestamp. It's just not supported in their SNAPSHOT mechanism. However,
one thing we could look into is whether we can align the timestamp with the
time of the actual Spark commit, rather than the time of publication of the
SNAPSHOT release. I'm not sure if Maven lets you provide a custom timestamp
when publishing. If we had that feature, users could look at the Spark commit
log and do some manual association.
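
To make the "manual association" idea concrete, here is a rough sketch of how
a snapshot timestamp could be matched against a commit log. It assumes the
usual Maven timestamp layout (yyyyMMdd.HHmmss, in UTC) and a commit list that
is already sorted by time; the function names and the sample hashes are made
up for illustration and are not part of any Spark tooling.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def parse_snapshot_timestamp(stamp):
    """Parse a Maven-style snapshot timestamp such as '20150806.031500' (UTC)."""
    parsed = datetime.strptime(stamp, "%Y%m%d.%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


def latest_commit_before(commits, when):
    """Return the hash of the newest commit at or before `when`.

    `commits` is a list of (commit_time, sha) tuples sorted by time,
    oldest first. Returns None if every commit is newer than `when`.
    """
    times = [commit_time for commit_time, _ in commits]
    idx = bisect_right(times, when)
    return commits[idx - 1][1] if idx else None


# Hypothetical commit log; real hashes and times would come from `git log`.
commits = [
    (datetime(2015, 8, 5, 12, 0, 0, tzinfo=timezone.utc), "aaa111"),
    (datetime(2015, 8, 6, 3, 0, 0, tzinfo=timezone.utc), "bbb222"),
    (datetime(2015, 8, 6, 9, 0, 0, tzinfo=timezone.utc), "ccc333"),
]
snapshot_time = parse_snapshot_timestamp("20150806.031500")
matched = latest_commit_before(commits, snapshot_time)
```

With the timestamp being the publication time, as it is today, this only
gives a best guess (the newest commit before publication). If the timestamp
were aligned with the commit time, the match would be exact.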

For the binaries, the reason why the same commit appears multiple times is that 
we do the build every four hours and always publish the latest one even if it's 
a duplicate. However, this could be modified pretty easily to just avoid 
double-publishing the same commit if there hasn't been any code change. Maybe 
create a JIRA for this?
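
A minimal sketch of what avoiding the double-publish could look like,
assuming the build job can write a small state file between runs. The file
name and both helper functions are hypothetical; the actual publishing
scripts may track this differently.

```python
from pathlib import Path


def already_published(commit, state_file):
    """Return True if `commit` matches the last commit recorded in `state_file`."""
    path = Path(state_file)
    if not path.exists():
        return False
    return path.read_text().strip() == commit


def record_published(commit, state_file):
    """Remember `commit` as the most recently published one."""
    Path(state_file).write_text(commit + "\n")
```

The four-hourly job would call already_published() with the current HEAD
hash first, skip the upload if it returns True, and call record_published()
only after a successful upload so that a failed publish gets retried.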

In terms of how many older versions are available, the scripts we use for this
have a tunable retention window. Right now I'm only keeping the last 4 builds;
we could probably extend it to something like 10 builds. However, at some point
I'm likely to run out of space in my ASF user account. Since the binaries are
quite large, I don't think it's feasible to keep all past builds, at least not
on ASF infrastructure. We have 3000 commits in a typical Spark release, and
each binary build is a few gigs.
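
To illustrate the retention window and the space math, here is a small
sketch. The function names are invented for this example, and the 2 GB
per-build figure is only a placeholder for "a few gigs", not a measured
number.

```python
def builds_to_delete(builds, keep=4):
    """Return the names of builds that fall outside the retention window.

    `builds` is a list of (publish_time, name) tuples in any order.
    The `keep` newest builds are retained; the rest are returned,
    oldest first.
    """
    ordered = sorted(builds)
    if keep <= 0:
        return [name for _, name in ordered]
    return [name for _, name in ordered[:-keep]]


def storage_needed_gb(num_builds, gb_per_build):
    """Rough storage estimate for keeping `num_builds` builds."""
    return num_builds * gb_per_build


# Placeholder size: 2 GB per build is an assumption, not a measurement.
all_commits_gb = storage_needed_gb(3000, 2)
ten_builds_gb = storage_needed_gb(10, 2)
```

Under that assumption, keeping a build for every commit in a release comes
to thousands of gigabytes, while a window of 10 builds stays in the tens.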

> Publish nightly snapshots of documentation, maven artifacts, and binary builds
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-1517
>                 URL: https://issues.apache.org/jira/browse/SPARK-1517
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build, Project Infra
>            Reporter: Patrick Wendell
>            Assignee: Patrick Wendell
>            Priority: Critical
>
> Should be pretty easy to do with Jenkins. The only thing I can think of that 
> would be tricky is to set up credentials so that jenkins can publish this 
> stuff somewhere on apache infra.
> Ideally we don't want to have to put a private key on every jenkins box 
> (since they are otherwise pretty stateless). One idea is to encrypt these 
> credentials with a passphrase and post them somewhere publicly visible. Then 
> the jenkins build can download the credentials provided we set a passphrase 
> in an environment variable in jenkins. There may be simpler solutions as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

