Re: Unable to run docker jdbc integrations test ?

2016-09-07 Thread Luciano Resende
That might be a reasonable and much simpler approach to try... but if we resolve these issues, we should make it part of some frequent build to make sure the build doesn't regress and that the actual functionality doesn't regress either. Let me look into this again... On Wed, Sep 7, 2016 at 2:46

Re: Unable to run docker jdbc integrations test ?

2016-09-07 Thread Josh Rosen
I think that these tests are valuable so I'd like to keep them. If possible, though, we should try to get rid of our dependency on the Spotify docker-client library, since it's a dependency hell nightmare. Given our relatively simple use of Docker here, I wonder whether we could just write some

Re: Removing published kinesis, ganglia artifacts due to license issues?

2016-09-07 Thread Matei Zaharia
The question is just whether the metadata and instructions involving these Maven packages count as sufficient to tell the user that they have different licensing terms. For example, our Ganglia package was called spark-ganglia-lgpl (so you'd notice it's a different license even from its name),

Re: Unable to run docker jdbc integrations test ?

2016-09-07 Thread Luciano Resende
It looks like nobody is running these tests, and after some dependency upgrades in Spark 2.0 they have stopped working. I have tried to bring them back up, but I am having some issues getting the right dependencies loaded and satisfying the docker-client expectations. The question then is:

Re: Removing published kinesis, ganglia artifacts due to license issues?

2016-09-07 Thread Cody Koeninger
To be clear, "safe" has very little to do with this. It's pretty clear that there's very little risk of the spark module for kinesis being considered a derivative work, much less all of spark. The use limitation in 3.3 that caused the amazon license to be put on the apache X list also doesn't

Re: Removing published kinesis, ganglia artifacts due to license issues?

2016-09-07 Thread Luciano Resende
On Wed, Sep 7, 2016 at 12:20 PM, Mridul Muralidharan wrote: > > It is good to get clarification, but the way I read it, the issue is > whether we publish it as official Apache artifacts (in maven, etc). > > Users can of course build it directly (and we can make it easy to do

Re: Removing published kinesis, ganglia artifacts due to license issues?

2016-09-07 Thread Luciano Resende
On Wed, Sep 7, 2016 at 11:57 AM, Matei Zaharia wrote: > I think you should ask legal about how to have some Maven artifacts for > these. Both Ganglia and Kinesis are very widely used, so it's weird to ask > users to build them from source. Maybe the Maven artifacts can

Re: Removing published kinesis, ganglia artifacts due to license issues?

2016-09-07 Thread Mridul Muralidharan
It is good to get clarification, but the way I read it, the issue is whether we publish it as official Apache artifacts (in maven, etc). Users can of course build it directly (and we can make it easy to do so) - as they are explicitly agreeing to additional licenses. Regards Mridul On

Re: Removing published kinesis, ganglia artifacts due to license issues?

2016-09-07 Thread Sean Owen
Agree, I've asked the question on that thread and will follow it up. I'd prefer not to pull these unless it's fairly clear it's going to be against policy. On Wed, Sep 7, 2016 at 7:57 PM, Matei Zaharia wrote: > I think you should ask legal about how to have some Maven

Re: Removing published kinesis, ganglia artifacts due to license issues?

2016-09-07 Thread Matei Zaharia
I think you should ask legal about how to have some Maven artifacts for these. Both Ganglia and Kinesis are very widely used, so it's weird to ask users to build them from source. Maybe the Maven artifacts can be marked as being under a different license? In the initial discussion for

Re: Removing published kinesis, ganglia artifacts due to license issues?

2016-09-07 Thread Sean Owen
(Credit to Luciano for pointing it out) Yes, it's clear why the assembly can't be published, but I had the same question about the non-assembly Kinesis (and Ganglia) artifact, because the published artifact has no code from Kinesis. See the related discussion at

Re: Removing published kinesis, ganglia artifacts due to license issues?

2016-09-07 Thread Mridul Muralidharan
I agree, we should not be publishing both of them. Thanks for bringing this up ! Regards, Mridul On Wed, Sep 7, 2016 at 1:29 AM, Sean Owen wrote: > It's worth calling attention to: > > https://issues.apache.org/jira/browse/SPARK-17418 >

Re: Removing published kinesis, ganglia artifacts due to license issues?

2016-09-07 Thread Cody Koeninger
I don't see a reason to remove the non-assembly artifact. Why would you? You're not distributing copies of Amazon-licensed code, and the Amazon license goes out of its way not to over-reach regarding derivative works. This seems pretty clearly to fall in the spirit of

Re: Discuss SparkR executors/workers support virtualenv

2016-09-07 Thread Shivaram Venkataraman
I think this makes sense -- making it easier to use additional R packages would be a good feature. I am not sure we need Packrat for this use case though. Let's continue the discussion on the JIRA at https://issues.apache.org/jira/browse/SPARK-17428 Thanks Shivaram On Tue, Sep 6, 2016 at 11:36 PM,

Re: How to get 2 years prior date from currentdate using Spark Sql

2016-09-07 Thread Herman van Hövell tot Westerflier
This is more of a @user question. You can write the following in SQL: select date '2016-09-07' - interval 2 years HTH On Wed, Sep 7, 2016 at 3:14 PM, Yong Zhang wrote: > sorry, should be date_sub > > > https://issues.apache.org/jira/browse/SPARK-8187 > [SPARK-8187] date/time

Re: How to get 2 years prior date from currentdate using Spark Sql

2016-09-07 Thread Yong Zhang
Sorry, that should be date_sub: https://issues.apache.org/jira/browse/SPARK-8187 ([SPARK-8187] date/time function: date_sub)

Re: How to get 2 years prior date from currentdate using Spark Sql

2016-09-07 Thread Yong Zhang
https://issues.apache.org/jira/browse/SPARK-8185 ([SPARK-8185] date/time function: datediff; part of SPARK-8159, Improve expression function coverage (Spark 1.5))

How to get 2 years prior date from currentdate using Spark Sql

2016-09-07 Thread farman.bsse1855
I need to derive the date 2 years prior to the current date using a query in Spark SQL. For example, today's date is 2016-09-07. I need to get the date exactly 2 years before this date in the above format (YYYY-MM-DD). Please let me know if there are multiple approaches and which one would be better. Thanks
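A minimal sketch of the calendar arithmetic being asked about, written in plain Python rather than Spark SQL so that it runs without a cluster. The helper name years_before is hypothetical, and clamping 29 February to 28 February is an assumption about the desired leap-day behaviour, not something stated in the thread. Within Spark SQL itself, a reply in this thread suggests: select date '2016-09-07' - interval 2 years

```python
from datetime import date


def years_before(d: date, years: int) -> date:
    """Return the date `years` calendar years before `d`.

    Hypothetical helper for illustration only. If `d` is 29 February and
    the target year is not a leap year, the result is clamped to
    28 February (an assumed policy).
    """
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        # Only reachable for 29 February mapped into a non-leap year.
        return d.replace(year=d.year - years, day=28)


if __name__ == "__main__":
    # isoformat() renders the result as YYYY-MM-DD.
    print(years_before(date(2016, 9, 7), 2).isoformat())
```

Subtracting whole calendar years, rather than a fixed count of days such as 730, keeps the month and day unchanged even when the span crosses a leap year.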

implement UDF/UDAF supporting whole stage codegen

2016-09-07 Thread assaf.mendelson
Hi, I want to write a UDF/UDAF which provides native processing performance. Currently, when creating a UDF/UDAF in the normal manner, performance suffers because it breaks optimizations. As a simple example, I wanted to create a UDF which tests whether the value is smaller than 10. I tried

Removing published kinesis, ganglia artifacts due to license issues?

2016-09-07 Thread Sean Owen
It's worth calling attention to: https://issues.apache.org/jira/browse/SPARK-17418 https://issues.apache.org/jira/browse/SPARK-17422 It looks like we need to at least not publish the kinesis *assembly* Maven artifact because it directly contains code under the Amazon Software License. However there's a

Discuss SparkR executors/workers support virtualenv

2016-09-07 Thread Yanbo Liang
Hi All, Many users need to use third-party R packages in executors/workers, but SparkR cannot satisfy this requirement elegantly. For example, you have to go through the IT/administrators of the cluster to deploy these R packages on each executor/worker node, which is very