[jira] [Commented] (SPARK-23154) Document backwards compatibility guarantees for ML persistence

2018-02-12 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361654#comment-16361654
 ] 

Apache Spark commented on SPARK-23154:
--

User 'jkbradley' has created a pull request for this issue:
https://github.com/apache/spark/pull/20592

> Document backwards compatibility guarantees for ML persistence
> --
>
> Key: SPARK-23154
> URL: https://issues.apache.org/jira/browse/SPARK-23154
> Project: Spark
>  Issue Type: Documentation
>  Components: Documentation, ML
>Affects Versions: 2.3.0
>Reporter: Joseph K. Bradley
>Assignee: Joseph K. Bradley
>Priority: Major
>
> We have (as far as I know) maintained backwards compatibility for ML 
> persistence, but this is not documented anywhere.  I'd like us to document it 
> (for spark.ml, not for spark.mllib).
> I'd recommend something like:
> {quote}
> In general, MLlib maintains backwards compatibility for ML persistence.  
> I.e., if you save an ML model or Pipeline in one version of Spark, then you 
> should be able to load it back and use it in a future version of Spark.  
> However, there are rare exceptions, described below.
> Model persistence: Is a model or Pipeline saved using Apache Spark ML 
> persistence in Spark version X loadable by Spark version Y?
> * Major versions: No guarantees, but best-effort.
> * Minor and patch versions: Yes; these are backwards compatible.
> * Note about the format: There are no guarantees for a stable persistence 
> format, but model loading itself is designed to be backwards compatible.
> Model behavior: Does a model or Pipeline in Spark version X behave 
> identically in Spark version Y?
> * Major versions: No guarantees, but best-effort.
> * Minor and patch versions: Identical behavior, except for bug fixes.
> For both model persistence and model behavior, any breaking changes across a 
> minor version or patch version are reported in the Spark version release 
> notes. If a breakage is not reported in release notes, then it should be 
> treated as a bug to be fixed.
> {quote}
> How does this sound?
> Note: We unfortunately don't have tests for backwards compatibility (which 
> has technical hurdles and can be discussed in [SPARK-15573]).  However, we 
> have made efforts to maintain it during PR review and Spark release QA, and 
> most users expect it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23154) Document backwards compatibility guarantees for ML persistence

2018-02-12 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16361575#comment-16361575
 ] 

Joseph K. Bradley commented on SPARK-23154:
---

I'd prefer to put it in the subsection on saving & loading.  I'll send a PR now.

[~yanboliang] I actually spent a long time trying to come up with ways to test 
this, and it's non-trivial.  The main blocker is that I got pushback from 
others about putting binary files (Parquet model data files) in the git repo.  
Without that, there isn't a way to store example models from past versions.  I 
may just build a separate project to test this outside of apache/spark itself 
when I get the chance.  You can find more notes in the JIRA linked in the 
description above.

> Document backwards compatibility guarantees for ML persistence
> --
>
> Key: SPARK-23154
> URL: https://issues.apache.org/jira/browse/SPARK-23154
> Project: Spark
>  Issue Type: Documentation
>  Components: Documentation, ML
>Affects Versions: 2.3.0
>Reporter: Joseph K. Bradley
>Assignee: Joseph K. Bradley
>Priority: Major
>
> We have (as far as I know) maintained backwards compatibility for ML 
> persistence, but this is not documented anywhere.  I'd like us to document it 
> (for spark.ml, not for spark.mllib).
> I'd recommend something like:
> {quote}
> In general, MLlib maintains backwards compatibility for ML persistence.  
> I.e., if you save an ML model or Pipeline in one version of Spark, then you 
> should be able to load it back and use it in a future version of Spark.  
> However, there are rare exceptions, described below.
> Model persistence: Is a model or Pipeline saved using Apache Spark ML 
> persistence in Spark version X loadable by Spark version Y?
> * Major versions: No guarantees, but best-effort.
> * Minor and patch versions: Yes; these are backwards compatible.
> * Note about the format: There are no guarantees for a stable persistence 
> format, but model loading itself is designed to be backwards compatible.
> Model behavior: Does a model or Pipeline in Spark version X behave 
> identically in Spark version Y?
> * Major versions: No guarantees, but best-effort.
> * Minor and patch versions: Identical behavior, except for bug fixes.
> For both model persistence and model behavior, any breaking changes across a 
> minor version or patch version are reported in the Spark version release 
> notes. If a breakage is not reported in release notes, then it should be 
> treated as a bug to be fixed.
> {quote}
> How does this sound?
> Note: We unfortunately don't have tests for backwards compatibility (which 
> has technical hurdles and can be discussed in [SPARK-15573]).  However, we 
> have made efforts to maintain it during PR review and Spark release QA, and 
> most users expect it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23154) Document backwards compatibility guarantees for ML persistence

2018-01-30 Thread Nick Pentreath (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344885#comment-16344885
 ] 

Nick Pentreath commented on SPARK-23154:


Where do we intend to put this note? In 
[http://spark.apache.org/docs/latest/ml-pipeline.html#saving-and-loading-pipelines?]
 Or as a new section in [http://spark.apache.org/docs/latest/ml-guide.html]?

> Document backwards compatibility guarantees for ML persistence
> --
>
> Key: SPARK-23154
> URL: https://issues.apache.org/jira/browse/SPARK-23154
> Project: Spark
>  Issue Type: Documentation
>  Components: Documentation, ML
>Affects Versions: 2.3.0
>Reporter: Joseph K. Bradley
>Assignee: Joseph K. Bradley
>Priority: Major
>
> We have (as far as I know) maintained backwards compatibility for ML 
> persistence, but this is not documented anywhere.  I'd like us to document it 
> (for spark.ml, not for spark.mllib).
> I'd recommend something like:
> {quote}
> In general, MLlib maintains backwards compatibility for ML persistence.  
> I.e., if you save an ML model or Pipeline in one version of Spark, then you 
> should be able to load it back and use it in a future version of Spark.  
> However, there are rare exceptions, described below.
> Model persistence: Is a model or Pipeline saved using Apache Spark ML 
> persistence in Spark version X loadable by Spark version Y?
> * Major versions: No guarantees, but best-effort.
> * Minor and patch versions: Yes; these are backwards compatible.
> * Note about the format: There are no guarantees for a stable persistence 
> format, but model loading itself is designed to be backwards compatible.
> Model behavior: Does a model or Pipeline in Spark version X behave 
> identically in Spark version Y?
> * Major versions: No guarantees, but best-effort.
> * Minor and patch versions: Identical behavior, except for bug fixes.
> For both model persistence and model behavior, any breaking changes across a 
> minor version or patch version are reported in the Spark version release 
> notes. If a breakage is not reported in release notes, then it should be 
> treated as a bug to be fixed.
> {quote}
> How does this sound?
> Note: We unfortunately don't have tests for backwards compatibility (which 
> has technical hurdles and can be discussed in [SPARK-15573]).  However, we 
> have made efforts to maintain it during PR review and Spark release QA, and 
> most users expect it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23154) Document backwards compatibility guarantees for ML persistence

2018-01-22 Thread Yanbo Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334757#comment-16334757
 ] 

Yanbo Liang commented on SPARK-23154:
-

Sounds good! It should be helpful to document backwards compatibility. Further 
more, I think we can write some tools to test the backwards compatibility for 
ML persistence during QA of releasing, just like performance regression test. 
Thanks.

> Document backwards compatibility guarantees for ML persistence
> --
>
> Key: SPARK-23154
> URL: https://issues.apache.org/jira/browse/SPARK-23154
> Project: Spark
>  Issue Type: Documentation
>  Components: Documentation, ML
>Affects Versions: 2.3.0
>Reporter: Joseph K. Bradley
>Assignee: Joseph K. Bradley
>Priority: Major
>
> We have (as far as I know) maintained backwards compatibility for ML 
> persistence, but this is not documented anywhere.  I'd like us to document it 
> (for spark.ml, not for spark.mllib).
> I'd recommend something like:
> {quote}
> In general, MLlib maintains backwards compatibility for ML persistence.  
> I.e., if you save an ML model or Pipeline in one version of Spark, then you 
> should be able to load it back and use it in a future version of Spark.  
> However, there are rare exceptions, described below.
> Model persistence: Is a model or Pipeline saved using Apache Spark ML 
> persistence in Spark version X loadable by Spark version Y?
> * Major versions: No guarantees, but best-effort.
> * Minor and patch versions: Yes; these are backwards compatible.
> * Note about the format: There are no guarantees for a stable persistence 
> format, but model loading itself is designed to be backwards compatible.
> Model behavior: Does a model or Pipeline in Spark version X behave 
> identically in Spark version Y?
> * Major versions: No guarantees, but best-effort.
> * Minor and patch versions: Identical behavior, except for bug fixes.
> For both model persistence and model behavior, any breaking changes across a 
> minor version or patch version are reported in the Spark version release 
> notes. If a breakage is not reported in release notes, then it should be 
> treated as a bug to be fixed.
> {quote}
> How does this sound?
> Note: We unfortunately don't have tests for backwards compatibility (which 
> has technical hurdles and can be discussed in [SPARK-15573]).  However, we 
> have made efforts to maintain it during PR review and Spark release QA, and 
> most users expect it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23154) Document backwards compatibility guarantees for ML persistence

2018-01-19 Thread Nick Pentreath (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332252#comment-16332252
 ] 

Nick Pentreath commented on SPARK-23154:


SGTM

> Document backwards compatibility guarantees for ML persistence
> --
>
> Key: SPARK-23154
> URL: https://issues.apache.org/jira/browse/SPARK-23154
> Project: Spark
>  Issue Type: Documentation
>  Components: Documentation, ML
>Affects Versions: 2.3.0
>Reporter: Joseph K. Bradley
>Assignee: Joseph K. Bradley
>Priority: Major
>
> We have (as far as I know) maintained backwards compatibility for ML 
> persistence, but this is not documented anywhere.  I'd like us to document it 
> (for spark.ml, not for spark.mllib).
> I'd recommend something like:
> {quote}
> In general, MLlib maintains backwards compatibility for ML persistence.  
> I.e., if you save an ML model or Pipeline in one version of Spark, then you 
> should be able to load it back and use it in a future version of Spark.  
> However, there are rare exceptions, described below.
> Model persistence: Is a model or Pipeline saved using Apache Spark ML 
> persistence in Spark version X loadable by Spark version Y?
> * Major versions: No guarantees, but best-effort.
> * Minor and patch versions: Yes; these are backwards compatible.
> * Note about the format: There are no guarantees for a stable persistence 
> format, but model loading itself is designed to be backwards compatible.
> Model behavior: Does a model or Pipeline in Spark version X behave 
> identically in Spark version Y?
> * Major versions: No guarantees, but best-effort.
> * Minor and patch versions: Identical behavior, except for bug fixes.
> For both model persistence and model behavior, any breaking changes across a 
> minor version or patch version are reported in the Spark version release 
> notes. If a breakage is not reported in release notes, then it should be 
> treated as a bug to be fixed.
> {quote}
> How does this sound?
> Note: We unfortunately don't have tests for backwards compatibility (which 
> has technical hurdles and can be discussed in [SPARK-15573]).  However, we 
> have made efforts to maintain it during PR review and Spark release QA, and 
> most users expect it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23154) Document backwards compatibility guarantees for ML persistence

2018-01-18 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16331811#comment-16331811
 ] 

Joseph K. Bradley commented on SPARK-23154:
---

CC [~mlnick], [~yanboliang] or others, what do you think?

> Document backwards compatibility guarantees for ML persistence
> --
>
> Key: SPARK-23154
> URL: https://issues.apache.org/jira/browse/SPARK-23154
> Project: Spark
>  Issue Type: Documentation
>  Components: Documentation, ML
>Affects Versions: 2.3.0
>Reporter: Joseph K. Bradley
>Assignee: Joseph K. Bradley
>Priority: Major
>
> We have (as far as I know) maintained backwards compatibility for ML 
> persistence, but this is not documented anywhere.  I'd like us to document it 
> (for spark.ml, not for spark.mllib).
> I'd recommend something like:
> {quote}
> In general, MLlib maintains backwards compatibility for ML persistence.  
> I.e., if you save an ML model or Pipeline in one version of Spark, then you 
> should be able to load it back and use it in a future version of Spark.  
> However, there are rare exceptions, described below.
> Model persistence: Is a model or Pipeline saved using Apache Spark ML 
> persistence in Spark version X loadable by Spark version Y?
> * Major versions: No guarantees, but best-effort.
> * Minor and patch versions: Yes; these are backwards compatible.
> * Note about the format: There are no guarantees for a stable persistence 
> format, but model loading itself is designed to be backwards compatible.
> Model behavior: Does a model or Pipeline in Spark version X behave 
> identically in Spark version Y?
> * Major versions: No guarantees, but best-effort.
> * Minor and patch versions: Identical behavior, except for bug fixes.
> For both model persistence and model behavior, any breaking changes across a 
> minor version or patch version are reported in the Spark version release 
> notes. If a breakage is not reported in release notes, then it should be 
> treated as a bug to be fixed.
> {quote}
> How does this sound?
> Note: We unfortunately don't have tests for backwards compatibility (which 
> has technical hurdles and can be discussed in [SPARK-15573]).  However, we 
> have made efforts to maintain it during PR review and Spark release QA, and 
> most users expect it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org