Re: [VOTE] Spark 2.1.2 (RC1)

2017-09-14 Thread Felix Cheung
+1 tested SparkR package on Windows, r-hub, Ubuntu.

test

2017-09-14 Thread Seb Kiureghian
test

CHAR implementation?

2017-09-14 Thread Dongjoon Hyun
Hi, All. Currently, Spark shows different behavior when we use CHAR types. spark-sql> CREATE TABLE t1(a CHAR(3)); spark-sql> CREATE TABLE t2(a CHAR(3)) STORED AS ORC; spark-sql> CREATE TABLE t3(a CHAR(3)) STORED AS PARQUET; spark-sql> INSERT INTO TABLE t1 SELECT 'a '; spark-sql> INSERT INTO
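
A minimal Scala sketch of the same comparison, assuming a Hive-enabled SparkSession; the table names follow the message, while enableHiveSupport() and the length(a) check are my additions for illustration, not part of the original mail:

  import org.apache.spark.sql.SparkSession

  // Sketch only: recreates the three CHAR(3) tables above and inspects how
  // the trailing space in 'a ' is stored for each format.
  val spark = SparkSession.builder()
    .appName("char-padding-check")
    .enableHiveSupport()
    .getOrCreate()

  spark.sql("CREATE TABLE t1(a CHAR(3))")
  spark.sql("CREATE TABLE t2(a CHAR(3)) STORED AS ORC")
  spark.sql("CREATE TABLE t3(a CHAR(3)) STORED AS PARQUET")

  Seq("t1", "t2", "t3").foreach { t =>
    spark.sql(s"INSERT INTO TABLE $t SELECT 'a '")
    // length(a) shows whether the value was padded to 3 chars, kept as 'a ', or trimmed.
    spark.sql(s"SELECT '$t' AS tbl, a, length(a) AS len FROM $t").show()
  }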

Re: [VOTE] Spark 2.1.2 (RC1)

2017-09-14 Thread Sean Owen
+1 Very nice. The sigs and hashes look fine; it builds fine for me on Debian Stretch with Java 8 and the yarn/hive/hadoop-2.7 profiles, and passes tests. Yes, as you say, there are no outstanding issues except for this one, which doesn't look critical as it's not a regression. SPARK-21985 PySpark PairDeserializer is

Re: [VOTE] Spark 2.1.2 (RC1)

2017-09-14 Thread Dongjoon Hyun
Yea. I think I found the root cause. The correct one is the following, as Sean said: https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%202.1.2 The current RC vote email has the following. List of JIRA tickets resolved in this release can be found

Re: Futures timeout exception in executor logs

2017-09-14 Thread Xuefu Zhang
I saw this quite often in our clusters. We have increased spark.executor.heartbeatInterval to 60s from the default value, which should help. The problem seems to be due to poor Spark driver performance and/or locking issues when the driver cannot process incoming events quickly enough. Thanks, Xuefu On Thu,
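
For illustration, a minimal sketch of that workaround, assuming the setting is applied when building the session; the 60s value comes from the message above, and the app name is a placeholder:

  import org.apache.spark.sql.SparkSession

  // Sketch only: raises the executor heartbeat interval from its 10s default
  // to the 60s value mentioned above. The same setting can also be passed to
  // spark-submit via --conf instead of being set in code.
  val spark = SparkSession.builder()
    .appName("heartbeat-tuning-example")               // placeholder name
    .config("spark.executor.heartbeatInterval", "60s")
    .getOrCreate()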

Re: [VOTE] Spark 2.1.2 (RC1)

2017-09-14 Thread Sean Owen
I think the search filter is OK, but for whatever reason the filter link includes whichever JIRA issue you're currently browsing, and that one is not actually included in the filter. It opens on a JIRA that's not included, but the search results look correct. project = SPARK AND fixVersion = 2.1.2 On Thu,

Re: [VOTE] Spark 2.1.2 (RC1)

2017-09-14 Thread Dongjoon Hyun
Hi, Holden. It's not a problem, but the link for `List of JIRA ... with this filter` seems to be wrong. Bests, Dongjoon. On Thu, Sep 14, 2017 at 10:47 AM, Holden Karau wrote: > Please vote on releasing the following candidate as Apache Spark version > 2.1.2. The vote is

Re: Easy way to get offset metatada with Spark Streaming API

2017-09-14 Thread Michael Armbrust
Yep, that is correct. You can also use the query ID, which is a GUID stored in the checkpoint and preserved across restarts, if you want to distinguish the batches from different streams. sqlContext.sparkContext.getLocalProperty(StreamExecution.QUERY_ID_KEY) This was added recently
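
A minimal sketch of that call, assuming it runs on the stream execution thread (for example inside a custom sink), which is where the local property is actually set; the import path is Spark's internal one and may change between releases:

  import org.apache.spark.sql.SparkSession
  import org.apache.spark.sql.execution.streaming.StreamExecution

  // Sketch only: reads the query ID local property described above. This
  // returns null on threads where the stream execution has not set it.
  val spark = SparkSession.builder().getOrCreate()
  val queryId: String = spark.sparkContext.getLocalProperty(StreamExecution.QUERY_ID_KEY)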

[VOTE] Spark 2.1.2 (RC1)

2017-09-14 Thread Holden Karau
Please vote on releasing the following candidate as Apache Spark version 2.1.2. The vote is open until Friday September 22nd at 18:00 PST and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 2.1.2 [ ] -1 Do not release this package because ...

Re: What is d3kbcqa49mib13.cloudfront.net ?

2017-09-14 Thread Sean Owen
I think the download could use the Apache mirror, yeah. I don't know if there's a reason that it must, though. What's good enough for releases is good enough for this purpose. People might not like the big download in the tests; if it really came up as an issue, we could find ways to cache it better

Re: What is d3kbcqa49mib13.cloudfront.net ?

2017-09-14 Thread Mark Hamstra
The problem is that it's not really an "official" download link, but rather just a supplemental convenience. While that may be OK when distributing artifacts, it's more of a problem when actually building and testing artifacts. In the latter case, the download should really only be from an Apache

New to dev community | Contribution to MLlib

2017-09-14 Thread Venali Sonone
Hello, I am new to the Spark dev community and to open source in general, but I have used Spark extensively. I want to create a complete anomaly detection component in Spark MLlib. I would like to know if someone could guide me so I can start the development and contribute to Spark MLlib. Sorry

Re: What is d3kbcqa49mib13.cloudfront.net ?

2017-09-14 Thread Wenchen Fan
That test case is trying to test the backward compatibility of `HiveExternalCatalog`. It downloads official Spark releases and creates tables with them, and then reads these tables via the current Spark. About the download link, I just picked it from the Spark website, and this link is the default

Futures timeout exception in executor logs

2017-09-14 Thread Simon Scott
Hi, Just wondering if anybody has any insights on SPARK-14140: Futures timeout exception in executor logs? We are seeing the exact same exception during a long-running iterative application on a Spark Standalone cluster, v2.1. At the same time as the exception appears on an executor,