Re: [discuss] SparkR CRAN feasibility check server problem

2018-12-12 Thread Liang-Chi Hsieh
Just got a reply from the CRAN admin. It should be fixed now. Hyukjin Kwon wrote > Thanks, Liang-chi. > > On Thu, 13 Dec 2018, 8:29 am Liang-Chi Hsieh > viirya@ > wrote: > > >> Sorry for the delay. There is a malformed record on the CRAN package page again. >> I've >> already asked the CRAN admin for

classpath entries with hdfs path

2018-12-12 Thread sandeep_katta
Hi all, I have a use case where some of the jars are on HDFS, and I want to include these jars in my driver class path. If I pass them with --jars it works fine, but if I pass them using spark.driver.extraClassPath it fails. spark-sql --master yarn --jars hdfs://hacluster/tmp/testjar/* //Jars are loaded to the
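One likely reason for the difference described above: spark.driver.extraClassPath entries are prepended to the driver JVM's classpath, and a JVM classpath takes plain filesystem paths, whereas --jars is processed by Spark itself, which can fetch remote jars. The sketch below is only an illustration of that distinction, not part of Spark; the function name and the /opt/libs path are hypothetical. It flags entries that carry a URI scheme and so would not be usable as raw classpath entries.

```python
from urllib.parse import urlparse


def split_classpath_entries(entries):
    """Split entries into plain paths and entries that carry a URI scheme.

    Hypothetical helper for illustration only. Note that Windows drive
    paths such as C:\\libs\\a.jar would be misread as having scheme "c".
    """
    plain, with_scheme = [], []
    for entry in entries:
        # urlparse returns an empty scheme for ordinary paths like /opt/a.jar
        if urlparse(entry).scheme:
            with_scheme.append(entry)
        else:
            plain.append(entry)
    return plain, with_scheme


plain, with_scheme = split_classpath_entries(
    ["/opt/libs/a.jar", "hdfs://hacluster/tmp/testjar/*"]
)
```

Here the hdfs:// entry ends up in with_scheme, which is the kind of entry one would expect to pass via --jars rather than via spark.driver.extraClassPath.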

dsv2 remaining work

2018-12-12 Thread Reynold Xin
Unfortunately I can't make it to the DSv2 sync today. Sending an email with my thoughts instead. I spent a few hours thinking about this. It's evident that progress has been slow, because this is an important API and people from different perspectives have very different requirements, and the

Re: [discuss] SparkR CRAN feasibility check server problem

2018-12-12 Thread Hyukjin Kwon
Thanks, Liang-chi. On Thu, 13 Dec 2018, 8:29 am Liang-Chi Hsieh > Sorry for the delay. There is a malformed record on the CRAN package page again. > I've > already asked the CRAN admin for help. It should be fixed soon according to > past > experience. > > Related discussion will be in >

Re: [discuss] SparkR CRAN feasibility check server problem

2018-12-12 Thread Liang-Chi Hsieh
Sorry for the delay. There is a malformed record on the CRAN package page again. I've already asked the CRAN admin for help. It should be fixed soon according to past experience. Related discussion will be in https://issues.apache.org/jira/browse/SPARK-24152. I will post here if I get a reply from the CRAN admin.

Re: Self join

2018-12-12 Thread Ryan Blue
Marco, I'm actually asking for a design doc that clearly states the problem and proposes a solution. This is a substantial change and probably should be an SPIP. I think that would be more likely to generate discussion than referring to PRs or a quick paragraph on the dev list, because the only

Re: Ask for reviewing on Structured Streaming PRs

2018-12-12 Thread Vaclav Kosar
I am also waiting for any finalization of my PR [3]. It seems that SS PRs are not being reviewed much these days. [3] https://github.com/apache/spark/pull/21919 On 12. 12. 18 14:37, Dongjin Lee wrote: If it is possible, could you review my PR on Kafka's header functionality[^1] also? It was

Re: [discuss] SparkR CRAN feasibility check server problem

2018-12-12 Thread Liang-Chi Hsieh
Thanks for letting me know! I will look into it and ask the CRAN admin for help. Hyukjin Kwon wrote > Looks like it's happening again. Liang-Chi, do you mind if I ask again? > > FYI, R versions prior to 3.4 are officially deprecated as of > https://github.com/apache/spark/pull/23012 > We could upgrade the R version to

Re: Ask for reviewing on Structured Streaming PRs

2018-12-12 Thread Dongjin Lee
If it is possible, could you review my PR on Kafka's header functionality[^1] also? It was added in Kafka 0.11.0.0 but still not supported in Spark. Thanks, Dongjin [^1]: https://github.com/apache/spark/pull/22282 [^2]: https://issues.apache.org/jira/browse/KAFKA-4208 On Wed, Dec 12, 2018 at

Re: [discuss] SparkR CRAN feasibility check server problem

2018-12-12 Thread Hyukjin Kwon
Looks like it's happening again. Liang-Chi, do you mind if I ask again? FYI, R versions prior to 3.4 are officially deprecated as of https://github.com/apache/spark/pull/23012 We could upgrade the R version to 3.4.x in Jenkins, which deals with the malformed(?) responses after the 3.0 release. Then, we could get rid of this

Re: Behavior of checkpointLocation from options vs setting conf spark.sql.streaming.checkpointLocation

2018-12-12 Thread Shubham Chaurasia
Thanks Gabor. On Wed, Dec 12, 2018, 4:06 PM Gabor Somogyi Hi Shubham, > > I've just checked the latest master branch and I can confirm it works as > you've described. > As a workaround one can read the ** in the directory > structure and can be set with .queryName("") before > restart. > > BR, >

Re: Behavior of checkpointLocation from options vs setting conf spark.sql.streaming.checkpointLocation

2018-12-12 Thread Gabor Somogyi
Hi Shubham, I've just checked the latest master branch and I can confirm it works as you've described. As a workaround one can read the ** in the directory structure and can be set with .queryName("") before restart. BR, G On Tue, Dec 11, 2018 at 6:45 AM Shubham Chaurasia wrote: > Hi, > > I
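To make the behaviour under discussion concrete, here is a toy Python model of how a streaming query's checkpoint directory is chosen: an explicit checkpointLocation option is used as given, while under spark.sql.streaming.checkpointLocation each query gets a subdirectory named after its query name, or a fresh random UUID if no name is set. This is a simplified sketch, not Spark code; the function name and the example paths are hypothetical, and Spark's temporary-directory fallback for some sinks is not modelled.

```python
import uuid
from typing import Optional


def resolve_checkpoint_location(
    option_location: Optional[str],
    conf_location: Optional[str],
    query_name: Optional[str],
) -> str:
    """Toy model of checkpoint directory selection (illustration only)."""
    if option_location is not None:
        # An explicit checkpointLocation option is used as-is.
        return option_location
    if conf_location is not None:
        # Under the session-wide conf, the query gets its own subdirectory:
        # the query name if one was set, otherwise a new random UUID.
        subdir = query_name if query_name is not None else str(uuid.uuid4())
        return conf_location.rstrip("/") + "/" + subdir
    raise ValueError("no checkpoint location configured")


# Without a query name, every start resolves to a different directory.
first = resolve_checkpoint_location(None, "/tmp/ckpt", None)
second = resolve_checkpoint_location(None, "/tmp/ckpt", None)

# With a fixed query name, a restart resolves to the same directory.
named = resolve_checkpoint_location(None, "/tmp/ckpt", "my_query")
```

This is why the workaround above helps: reusing the generated directory name as the query name makes a restarted query resolve to the directory it used before.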

Re: Self join

2018-12-12 Thread Marco Gaido
Thank you all for your answers. @Ryan Blue sure, let me state the problem more clearly: imagine you have 2 dataframes with a common lineage (for instance one is derived from the other by some filtering or anything you prefer). And imagine you want to join these 2 dataframes. Currently, there is

Ask for reviewing on Structured Streaming PRs

2018-12-12 Thread Jungtaek Lim
Hi devs, May I kindly ask for reviews of the PRs for Structured Streaming? I have 5 open pull requests on the SS side [1] (the earliest PR was opened around 4 months ago), and there look to be a couple of PRs from others [2] which look ready to be reviewed, too. Thanks in advance, Jungtaek Lim