[jira] [Commented] (PIO-30) Cross build for different versions of scala and spark

2017-03-17 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/PIO-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930020#comment-15930020 ]

ASF GitHub Bot commented on PIO-30:
---

GitHub user shimamoto opened a pull request:

https://github.com/apache/incubator-predictionio/pull/364

[PIO-30] Spark 2 support on sbt way

The goals of this PR are:
- Define the versions of Scala to build against in the `crossScalaVersions`
setting (see the build.sbt sketch after this list)
  - Scala 2.10 and 2.11
- At the project root, run `make-distribution.sh` to create tarballs for all
versions:
  - PredictionIO_2.10-0.11.0-$VERSION.tar.gz <- intended for Spark 1 users
  - PredictionIO_2.11-0.11.0-$VERSION.tar.gz <- intended for Spark 2 users
- Keep using `SQLContext` (maintain the status quo)
- I would like support for Scala 2.10 to be deprecated as of the next release
- Supported Hadoop versions are 2.6+, which matters for the HDFS storage
backend:
  - 2.6 is in lib/spark (default)
  - 2.7 is in lib/extra
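
For illustration, the cross-build wiring could look roughly like this in
`build.sbt`; a minimal sketch under the assumption that the Spark line is
selected per Scala binary version (the exact versions and settings in this
PR may differ):

```
// Minimal cross-build sketch (sbt 0.13-era syntax).
// The Spark versions paired with each Scala version are illustrative.
scalaVersion := "2.11.8"
crossScalaVersions := Seq("2.10.6", "2.11.8")

// Scala 2.10 pairs with Spark 1.x; Scala 2.11 pairs with Spark 2.x.
libraryDependencies += {
  CrossVersion.partialVersion(scalaVersion.value) match {
    case Some((2, 10)) => "org.apache.spark" %% "spark-core" % "1.6.3" % "provided"
    case _             => "org.apache.spark" %% "spark-core" % "2.1.0" % "provided"
  }
}
```

With `crossScalaVersions` defined, prefixing a task with `+` (e.g.
`sbt +package`) runs it once per listed Scala version.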

@chanlee514 @dszeto
I did #345.

## TODO
Currently, only the tests for the traditional version work. Tests for the
new version (Spark 2) still need to be added.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/shimamoto/incubator-predictionio scala211

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-predictionio/pull/364.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #364


commit 432e0b85001bf4cd6f4d09e3a84dc95b2b5c7464
Author: shimamoto 
Date:   2017-03-09T12:07:14Z

Build against Scala 2.10 and 2.11.

commit 8d74bb8cc40619979b7af984ff4918759b6edba1
Author: shimamoto 
Date:   2017-03-17T05:50:43Z

Merge remote-tracking branch 'upstreamjp/develop' into scala211

commit 6a2dbdf865b271b1db0abf5dcebcdfe372d52b44
Author: shimamoto 
Date:   2017-03-17T07:56:03Z

fixup




> Cross build for different versions of scala and spark
> ------------------------------------------------------
>
> Key: PIO-30
> URL: https://issues.apache.org/jira/browse/PIO-30
> Project: PredictionIO
> Issue Type: Improvement
> Reporter: Marcin Ziemiński
> Assignee: Chan
> Fix For: 0.11.0
>
>
> The present version of Scala is 2.10 and of Spark is 1.4, which is quite old.
> With Spark 2.0.0 come many performance improvements and features that people
> will definitely want to add to their templates. I am also aware that the past
> cannot be ignored and that simply dropping 1.x might not be an option for
> other users.
> I propose setting up a cross-build in sbt: one build for Scala 2.10 and Spark
> 1.6, and a separate one for Scala 2.11 and Spark 2.0. Most of the files,
> including the API, will be shared between the versions. The problematic ones
> will be split between additional source directories, src/main/scala-2.10/ and
> src/main/scala-2.11/, as sketched below. The dockerized tests should also take
> the two versions into consideration.
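
As a side note, the version-specific source directories proposed above follow
a standard sbt pattern; a minimal sketch using stock sbt keys:

```
// Compile src/main/scala-2.10/ or src/main/scala-2.11/ in addition to
// src/main/scala/, depending on the Scala binary version being built.
unmanagedSourceDirectories in Compile +=
  (sourceDirectory in Compile).value / ("scala-" + scalaBinaryVersion.value)
```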





[jira] [Commented] (PIO-30) Cross build for different versions of scala and spark

2017-03-17 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/PIO-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930044#comment-15930044 ]

ASF GitHub Bot commented on PIO-30:
---

Github user dszeto commented on the issue:

https://github.com/apache/incubator-predictionio/pull/364
  
@shimamoto I realized the original build system was quite inflexible, so I 
went ahead and did this over the last couple of days: 
https://travis-ci.org/apache/incubator-predictionio/builds/212014483. It's very 
close to working, with tests across different dependencies. My apologies for 
not syncing up with you first.

The build system on that branch basically allows you to do this:
```
./make-distribution.sh -Dscala.version=2.10.6 -Dspark.version=2.1.0 \
  -Dhadoop.version=2.7.3 -Delasticsearch.version=5.2.2
```
The script does not cross build, but `crossScalaVersions` in `build.sbt` is 
already defined, so it is possible to cross build and cross publish artifacts. 
`make-distribution.sh` could also be extended to produce more than one tarball.
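
For context, those `-D` flags would typically surface in the build as system
properties; a sketch assuming plain `sys.props` lookups with defaults (the
property names mirror the command above, but the branch may read them
differently):

```
// Resolve dependency versions from -D system properties, with defaults.
val sparkVersion = sys.props.getOrElse("spark.version", "1.6.3")
val hadoopVersion = sys.props.getOrElse("hadoop.version", "2.6.5")

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.hadoop" % "hadoop-common" % hadoopVersion % "provided"
)
```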

Also, until `docker-compose` becomes available on Apache Jenkins, I am using 
a multi-build job just to make sure things compile across versions: 
https://builds.apache.org/job/incubator-predictionio-multibuild/

Could you take a look at this branch and see how it looks to you? It would 
be great to converge our changes.






[jira] [Commented] (PIO-30) Cross build for different versions of scala and spark

2017-03-17 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/PIO-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930383#comment-15930383 ]

ASF GitHub Bot commented on PIO-30:
---

Github user shimamoto commented on the issue:

https://github.com/apache/incubator-predictionio/pull/364
  
@chanlee514 I looked over the branch :)
I would like to confirm a couple of things.

- For which versions are you going to provide prebuilt packages?
- Do you mean that PredictionIO also supports combinations that require users 
to build Spark themselves (e.g., Scala 2.10 & Spark 2.1.0)? This is just my 
personal opinion, but my impression is that 2.10 is intended for Spark 1 users 
and 2.11 is intended for Spark 2 users.
- The ES5 module has an `elasticsearch-spark-13` dependency. We need to use 
`elasticsearch-spark-20` for Spark 2 (see the sketch after this list).
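
For reference, the switch could be expressed in `build.sbt` roughly as
follows; a sketch, with the `spark.version` property name assumed from the
command earlier in the thread and connector coordinates from the
elasticsearch-hadoop project:

```
// Choose the elasticsearch-spark connector matching the Spark line:
// elasticsearch-spark-13 targets Spark 1.3-1.6; elasticsearch-spark-20 targets Spark 2.x.
val sparkVersion = sys.props.getOrElse("spark.version", "2.1.0")

libraryDependencies += {
  if (sparkVersion.startsWith("1."))
    "org.elasticsearch" %% "elasticsearch-spark-13" % "5.2.2"
  else
    "org.elasticsearch" %% "elasticsearch-spark-20" % "5.2.2"
}
```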





[jira] [Commented] (PIO-30) Cross build for different versions of scala and spark

2017-03-17 Thread ASF GitHub Bot (JIRA)

[ https://issues.apache.org/jira/browse/PIO-30?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15930568#comment-15930568 ]

ASF GitHub Bot commented on PIO-30:
---

Github user dszeto commented on the issue:

https://github.com/apache/incubator-predictionio/pull/364
  
@shimamoto We can't really release binary packages on Apache until we have 
verified that all packaged transitive dependencies comply with Apache 
policies. That will be a non-trivial exercise. Many people want to ship 0.11 
as soon as possible, so binary packages would probably come in the next 
release. Until then we will still require users to build their own package, 
and cross building everything in one shot would take too much time for them.

The goal is mostly flexibility to use different dependencies instead of 
enforcing a fixed set. We will probably only officially test Scala 2.10 + 
Spark 1.6 and Scala 2.11 + Spark 2. I tried other combos but haven't been 
able to make them work yet.

Agreed, the ES dependency needs to be fixed.



