subject:"\[GitHub\] spark pull request\: \[SPARK\-15517\]\[SQL\]\[STREAMING\] Add support for ..."

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for complete o...

2016-05-31 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/13286


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for complete o...

2016-05-31 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/13286
  
LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread tdas

Github user tdas commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-86598
  
@rxin @marmbrus Are you okay with current OutputMode design? Add unit test 
for Java compatibility of OutputMode.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-85060
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-85061
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59542/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-85015
  
**[Test build #59542 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59542/consoleFull)**
 for PR 13286 at commit 
[`e951798`](https://github.com/apache/spark/commit/e951798bf4511cabc08d242e7f1a3d7d1e653263).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-81071
  
**[Test build #59542 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59542/consoleFull)**
 for PR 13286 at commit 
[`e951798`](https://github.com/apache/spark/commit/e951798bf4511cabc08d242e7f1a3d7d1e653263).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-80807
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59541/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-80806
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-80804
  
**[Test build #59541 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59541/consoleFull)**
 for PR 13286 at commit 
[`4784e18`](https://github.com/apache/spark/commit/4784e18efcc552fc99c2fd9d5ccfe1c78e479c79).
 * This patch **fails RAT tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-80740
  
**[Test build #59541 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59541/consoleFull)**
 for PR 13286 at commit 
[`4784e18`](https://github.com/apache/spark/commit/4784e18efcc552fc99c2fd9d5ccfe1c78e479c79).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64980368
  
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/OutputMode.java 
---
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql;
+
+import org.apache.spark.annotation.Experimental;
+
+/**
+ * :: Experimental ::
+ *
+ * OutputMode is used to what data will be written to a streaming sink 
when there is
+ * new data available in a streaming DataFrame/Dataset.
+ *
+ * @since 2.0.0
+ */
+@Experimental
+public class OutputMode {
--- End diff --

Okay, writing it in Scala does not work because we need define  static 
methods on OutputMode ('Append()', etc.) as well as the OutputMode interface 
that singleton object Append will extend. To do this in Scala we have to do 
```
trait OutputMode
object OutputMode {
   def Append(): OutputMode = ???
}
```
But this makes the OutputMode static object unusable from Java, as you will 
have to call `OutputMode$.MODULE$.Append()`. 

So doing this in Java is the cleanest way for it to be usable in both Java 
and Scala. I am including a JavaOutputModeSuite as well. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64979710
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/ContinuousQueryManagerSuite.scala
 ---
@@ -237,15 +237,15 @@ class ContinuousQueryManagerSuite extends StreamTest 
with SharedSQLContext with
   try {
 val df = ds.toDF
 val metadataRoot =
-  Utils.createTempDir(namePrefix = 
"streaming.metadata").getCanonicalPath
-query = spark
-  .streams
-  .startQuery(
-StreamExecution.nextName,
-metadataRoot,
-df,
-new MemorySink(df.schema))
-  .asInstanceOf[StreamExecution]
+  Utils.createTempDir(namePrefix = 
"streaming.checkpoint").getCanonicalPath
+query =
+  df.write
+.format("memory")
+.queryName(s"query${Random.nextInt(10)}")
--- End diff --

Done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64976955
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/InternalOutputModes.scala ---
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+/**
+ * Internal helper class to generate objects representing various 
[[OutputMode]]s,
+ */
+private[sql] object InternalOutputModes {
--- End diff --

Based on offline discussion with @rxin, will do this move in a later PR 
once we decide globally which classes should be moved to what package.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64976764
  
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -500,6 +500,26 @@ def mode(self, saveMode):
 self._jwrite = self._jwrite.mode(saveMode)
 return self
 
+@since(2.0)
+def outputMode(self, outputMode):
+"""Specifies how data of a streaming DataFrame/Dataset is written 
to a streaming sink.
+
+Options include:
+
+* `append`:Only the new rows in the streaming DataFrame/Dataset 
will be written to
+   the sink
+* `complete`:All the rows in the streaming DataFrame/Dataset will 
be written to the sink
+   every time these is some updates
--- End diff --

I want to write something that makes sense generally, without understanding 
trigger and all. As is, since the trigger is optional, one does not need to 
know about triggers at all to start running stuff in structured streaming.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64976281
  
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/OutputMode.java 
---
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql;
+
+import org.apache.spark.annotation.Experimental;
+
+/**
+ * :: Experimental ::
+ *
+ * OutputMode is used to what data will be written to a streaming sink 
when there is
+ * new data available in a streaming DataFrame/Dataset.
+ *
+ * @since 2.0.0
+ */
+@Experimental
+public class OutputMode {
--- End diff --

I believe they are equivalent when there are no implemented methods.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64976140
  
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/OutputMode.java 
---
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql;
+
+import org.apache.spark.annotation.Experimental;
+
+/**
+ * :: Experimental ::
+ *
+ * OutputMode is used to what data will be written to a streaming sink 
when there is
+ * new data available in a streaming DataFrame/Dataset.
+ *
+ * @since 2.0.0
+ */
+@Experimental
+public class OutputMode {
--- End diff --

We also need a trait OutputMode that InternalOutputModes.Append will 
extend. Should that be a Scala trait or Java interface?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread rxin

Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64968245
  
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/OutputMode.java 
---
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql;
+
+import org.apache.spark.annotation.Experimental;
+
+/**
+ * :: Experimental ::
+ *
+ * OutputMode is used to what data will be written to a streaming sink 
when there is
+ * new data available in a streaming DataFrame/Dataset.
+ *
+ * @since 2.0.0
+ */
+@Experimental
+public class OutputMode {
--- End diff --

one downside of writing this in java is that it doesn't show up in scaladocs


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64968033
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/InternalOutputModes.scala ---
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+/**
+ * Internal helper class to generate objects representing various 
[[OutputMode]]s,
+ */
+private[sql] object InternalOutputModes {
--- End diff --

Moving them to org.apache.spark.streaming in catalyst.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64967749
  
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/OutputMode.java 
---
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql;
+
+import org.apache.spark.annotation.Experimental;
+
+/**
+ * :: Experimental ::
+ *
+ * OutputMode is used to what data will be written to a streaming sink 
when there is
+ * new data available in a streaming DataFrame/Dataset.
+ *
+ * @since 2.0.0
+ */
+@Experimental
+public class OutputMode {
--- End diff --

But there are no tests written in java :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64967195
  
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/OutputMode.java 
---
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql;
+
+import org.apache.spark.annotation.Experimental;
+
+/**
+ * :: Experimental ::
+ *
+ * OutputMode is used to what data will be written to a streaming sink 
when there is
+ * new data available in a streaming DataFrame/Dataset.
+ *
+ * @since 2.0.0
+ */
+@Experimental
+public class OutputMode {
--- End diff --

Sure. Just wanted to be extra-cautious for java-safety.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-54170
  
LGTM with a few comments.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64966959
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/InternalOutputModes.scala ---
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+/**
+ * Internal helper class to generate objects representing various 
[[OutputMode]]s,
+ */
+private[sql] object InternalOutputModes {
--- End diff --

oh, mostly I don't think they need to be in `org.apache.spark.sql` they 
could live in catalyst too.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64966816
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -77,7 +77,47 @@ final class DataFrameWriter private[sql](df: DataFrame) {
   case "ignore" => SaveMode.Ignore
   case "error" | "default" => SaveMode.ErrorIfExists
   case _ => throw new IllegalArgumentException(s"Unknown save mode: 
$saveMode. " +
-"Accepted modes are 'overwrite', 'append', 'ignore', 'error'.")
+"Accepted save modes are 'overwrite', 'append', 'ignore', 
'error'.")
--- End diff --

I was thinking the same. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64966890
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/ContinuousQueryManagerSuite.scala
 ---
@@ -237,15 +237,15 @@ class ContinuousQueryManagerSuite extends StreamTest 
with SharedSQLContext with
   try {
 val df = ds.toDF
 val metadataRoot =
-  Utils.createTempDir(namePrefix = 
"streaming.metadata").getCanonicalPath
-query = spark
-  .streams
-  .startQuery(
-StreamExecution.nextName,
-metadataRoot,
-df,
-new MemorySink(df.schema))
-  .asInstanceOf[StreamExecution]
+  Utils.createTempDir(namePrefix = 
"streaming.checkpoint").getCanonicalPath
+query =
+  df.write
+.format("memory")
+.queryName(s"query${Random.nextInt(10)}")
--- End diff --

Will make it a counter. :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64966732
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/InternalOutputModes.scala ---
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+/**
+ * Internal helper class to generate objects representing various 
[[OutputMode]]s,
+ */
+private[sql] object InternalOutputModes {
--- End diff --

They are in catalyst. Still move it to execution.streaming?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64966038
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/InternalOutputModes.scala ---
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql
+
+/**
+ * Internal helper class to generate objects representing various 
[[OutputMode]]s,
+ */
+private[sql] object InternalOutputModes {
--- End diff --

If this is going to be internal we should probably just move it to 
`execution.streaming`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64962858
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/ContinuousQueryManagerSuite.scala
 ---
@@ -237,15 +237,15 @@ class ContinuousQueryManagerSuite extends StreamTest 
with SharedSQLContext with
   try {
 val df = ds.toDF
 val metadataRoot =
-  Utils.createTempDir(namePrefix = 
"streaming.metadata").getCanonicalPath
-query = spark
-  .streams
-  .startQuery(
-StreamExecution.nextName,
-metadataRoot,
-df,
-new MemorySink(df.schema))
-  .asInstanceOf[StreamExecution]
+  Utils.createTempDir(namePrefix = 
"streaming.checkpoint").getCanonicalPath
+query =
+  df.write
+.format("memory")
+.queryName(s"query${Random.nextInt(10)}")
--- End diff --

There could still be conflicts?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64962236
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -77,7 +77,47 @@ final class DataFrameWriter private[sql](df: DataFrame) {
   case "ignore" => SaveMode.Ignore
   case "error" | "default" => SaveMode.ErrorIfExists
   case _ => throw new IllegalArgumentException(s"Unknown save mode: 
$saveMode. " +
-"Accepted modes are 'overwrite', 'append', 'ignore', 'error'.")
+"Accepted save modes are 'overwrite', 'append', 'ignore', 
'error'.")
--- End diff --

We might consider aliasing `mode` as `saveMode` and deprecating `mode`.

/cc @rxin 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64962057
  
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/OutputMode.java 
---
@@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql;
+
+import org.apache.spark.annotation.Experimental;
+
+/**
+ * :: Experimental ::
+ *
+ * OutputMode is used to what data will be written to a streaming sink 
when there is
+ * new data available in a streaming DataFrame/Dataset.
+ *
+ * @since 2.0.0
+ */
+@Experimental
+public class OutputMode {
--- End diff --

This doesn't have to be in `java`, see 
[Encoders.scala](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/Encoders.scala)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-27 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64961894
  
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -500,6 +500,26 @@ def mode(self, saveMode):
 self._jwrite = self._jwrite.mode(saveMode)
 return self
 
+@since(2.0)
+def outputMode(self, outputMode):
+"""Specifies how data of a streaming DataFrame/Dataset is written 
to a streaming sink.
+
+Options include:
+
+* `append`:Only the new rows in the streaming DataFrame/Dataset 
will be written to
+   the sink
+* `complete`:All the rows in the streaming DataFrame/Dataset will 
be written to the sink
+   every time these is some updates
--- End diff --

each time the trigger fires?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222074331
  
**[Test build #3024 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3024/consoleFull)**
 for PR 13286 at commit 
[`85ce263`](https://github.com/apache/spark/commit/85ce2638cf9c9150e2258749bb894a39779d24cc).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222063441
  
**[Test build #3024 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3024/consoleFull)**
 for PR 13286 at commit 
[`85ce263`](https://github.com/apache/spark/commit/85ce2638cf9c9150e2258749bb894a39779d24cc).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222045976
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59444/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222045973
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222045851
  
**[Test build #59444 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59444/consoleFull)**
 for PR 13286 at commit 
[`85ce263`](https://github.com/apache/spark/commit/85ce2638cf9c9150e2258749bb894a39779d24cc).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64848139
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/MemorySinkSuite.scala ---
@@ -88,4 +244,15 @@ class MemorySinkSuite extends StreamTest with 
SharedSQLContext {
 .startStream()
 }
   }
+
+  private def checkAnswer(rows: Seq[Row], expected: Seq[Int])(implicit 
schema: StructType): Unit = {
+checkAnswer(
+  sqlContext.createDataFrame(sparkContext.makeRDD(rows), schema),
+  intsToDF(expected)(schema))
+  }
+
+  implicit def intsToDF(seq: Seq[Int])(implicit schema: StructType): 
DataFrame = {
--- End diff --

nit: add private


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64848125
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/ContinuousQueryManagerSuite.scala
 ---
@@ -237,15 +237,15 @@ class ContinuousQueryManagerSuite extends StreamTest 
with SharedSQLContext with
   try {
 val df = ds.toDF
 val metadataRoot =
-  Utils.createTempDir(namePrefix = 
"streaming.metadata").getCanonicalPath
-query = spark
-  .streams
-  .startQuery(
-StreamExecution.nextName,
-metadataRoot,
-df,
-new MemorySink(df.schema))
-  .asInstanceOf[StreamExecution]
+  Utils.createTempDir(namePrefix = 
"streaming.checkpoint").getCanonicalPath
+query =
+  df.write
+.format("memory")
+.queryName(s"query${Random.nextInt(10)}")
+.option("checkpointLocation", metadataRoot)
+.outputMode("append")
+.startStream()
+.asInstanceOf[StreamExecution]
--- End diff --

This change is to make more unit tests use memory format ==> more testing.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222037903
  
**[Test build #59444 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59444/consoleFull)**
 for PR 13286 at commit 
[`85ce263`](https://github.com/apache/spark/commit/85ce2638cf9c9150e2258749bb894a39779d24cc).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64846040
  
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -500,6 +500,28 @@ def mode(self, saveMode):
 self._jwrite = self._jwrite.mode(saveMode)
 return self
 
+@since(2.0)
+def outputMode(self, outputMode):
+"""Specifies how data of a streaming DataFrame/Dataset is written 
to a streaming sink.
+
+Options include:
+
+* `append`:Only the new rows in the streaming DataFrame/Dataset 
will be written to
+   the sink
+* `update`:Only the changed rows in the streaming 
DataFrame/Dataset will be written to
--- End diff --

good catch. fixed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222036201
  
Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222036203
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59425/
Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222036038
  
**[Test build #59425 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59425/consoleFull)**
 for PR 13286 at commit 
[`369e9d5`](https://github.com/apache/spark/commit/369e9d5d58a762d9af794ed791410249d7032fa5).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread zsxwing

Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64843274
  
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -500,6 +500,28 @@ def mode(self, saveMode):
 self._jwrite = self._jwrite.mode(saveMode)
 return self
 
+@since(2.0)
+def outputMode(self, outputMode):
+"""Specifies how data of a streaming DataFrame/Dataset is written 
to a streaming sink.
+
+Options include:
+
+* `append`:Only the new rows in the streaming DataFrame/Dataset 
will be written to
+   the sink
+* `update`:Only the changed rows in the streaming 
DataFrame/Dataset will be written to
--- End diff --

nit: remove this line


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64843204
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -77,7 +77,47 @@ final class DataFrameWriter private[sql](df: DataFrame) {
   case "ignore" => SaveMode.Ignore
   case "error" | "default" => SaveMode.ErrorIfExists
   case _ => throw new IllegalArgumentException(s"Unknown save mode: 
$saveMode. " +
-"Accepted modes are 'overwrite', 'append', 'ignore', 'error'.")
+"Accepted save modes are 'overwrite', 'append', 'ignore', 
'error'.")
+}
+this
+  }
+
+  /**
+   * Specifies how data of a streaming DataFrame/Dataset is written to a 
streaming sink.
+   *   - `OutputMode.Append()`:   only the new rows in the streaming 
DataFrame/Dataset will be
+   *written to the sink
+   *   - `OutputMode.Complete()`: all the rows in the streaming 
DataFrame/Dataset will be written
+   *to the sink every time these is some 
updates
+   *
+   * @since 2.0.0
+   */
+  @Experimental
+  def outputMode(outputMode: OutputMode): DataFrameWriter = {
+assertStreaming("outputMode() can only be called on continuous 
queries")
+this.outputMode = outputMode
+this
+  }
+
+  /**
+   * Specifies how data of a streaming DataFrame/Dataset is written to a 
streaming sink.
+   *   - `append`:   only the new rows in the streaming DataFrame/Dataset 
will be written to
+   * the sink
+   *   - `complete`: all the rows in the streaming DataFrame/Dataset will 
be written to the sink
+   * every time these is some updates
+   *
+   * @since 2.0.0
+   */
+  @Experimental
+  def outputMode(outputMode: String): DataFrameWriter = {
--- End diff --

That we can decide later. That sounds too complicated to reason about right 
now when we have not even finalized how to specify Update mode. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread zsxwing

Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64842979
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -77,7 +77,47 @@ final class DataFrameWriter private[sql](df: DataFrame) {
   case "ignore" => SaveMode.Ignore
   case "error" | "default" => SaveMode.ErrorIfExists
   case _ => throw new IllegalArgumentException(s"Unknown save mode: 
$saveMode. " +
-"Accepted modes are 'overwrite', 'append', 'ignore', 'error'.")
+"Accepted save modes are 'overwrite', 'append', 'ignore', 
'error'.")
+}
+this
+  }
+
+  /**
+   * Specifies how data of a streaming DataFrame/Dataset is written to a 
streaming sink.
+   *   - `OutputMode.Append()`:   only the new rows in the streaming 
DataFrame/Dataset will be
+   *written to the sink
+   *   - `OutputMode.Complete()`: all the rows in the streaming 
DataFrame/Dataset will be written
+   *to the sink every time these is some 
updates
+   *
+   * @since 2.0.0
+   */
+  @Experimental
+  def outputMode(outputMode: OutputMode): DataFrameWriter = {
+assertStreaming("outputMode() can only be called on continuous 
queries")
+this.outputMode = outputMode
+this
+  }
+
+  /**
+   * Specifies how data of a streaming DataFrame/Dataset is written to a 
streaming sink.
+   *   - `append`:   only the new rows in the streaming DataFrame/Dataset 
will be written to
+   * the sink
+   *   - `complete`: all the rows in the streaming DataFrame/Dataset will 
be written to the sink
+   * every time these is some updates
+   *
+   * @since 2.0.0
+   */
+  @Experimental
+  def outputMode(outputMode: String): DataFrameWriter = {
--- End diff --

@tdas do we need to think about how to support the `update` mode for this 
method? "update(columnName)"?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64838480
  
--- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/OutputMode.java 
---
@@ -0,0 +1,29 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql;
+
+public class OutputMode {
--- End diff --

add docs



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64838491
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/InternalOutputModes.scala ---
@@ -15,9 +15,10 @@
  * limitations under the License.
  */
 
-package org.apache.spark.sql.catalyst.analysis
+package org.apache.spark.sql
 
-sealed trait OutputMode
-
-case object Append extends OutputMode
-case object Update extends OutputMode
+private[sql] object InternalOutputModes {
--- End diff --

add docs.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222022412
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222022356
  
**[Test build #59425 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59425/consoleFull)**
 for PR 13286 at commit 
[`369e9d5`](https://github.com/apache/spark/commit/369e9d5d58a762d9af794ed791410249d7032fa5).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222022414
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59420/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222022295
  
**[Test build #59420 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59420/consoleFull)**
 for PR 13286 at commit 
[`bbd6022`](https://github.com/apache/spark/commit/bbd6022cf3238a78604c8ad43df4c917d792f37f).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222019450
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222019440
  
**[Test build #59424 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59424/consoleFull)**
 for PR 13286 at commit 
[`4973621`](https://github.com/apache/spark/commit/4973621c23aac441c26c172a6e6322454b9797f7).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222019451
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59424/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222019179
  
**[Test build #59424 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59424/consoleFull)**
 for PR 13286 at commit 
[`4973621`](https://github.com/apache/spark/commit/4973621c23aac441c26c172a6e6322454b9797f7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-222009156
  
**[Test build #59420 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59420/consoleFull)**
 for PR 13286 at commit 
[`bbd6022`](https://github.com/apache/spark/commit/bbd6022cf3238a78604c8ad43df4c917d792f37f).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64796991
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulAggregate.scala
 ---
@@ -82,40 +82,62 @@ case class StateStoreRestoreExec(
 case class StateStoreSaveExec(
 keyExpressions: Seq[Attribute],
 stateId: Option[OperatorStateId],
+returnAllStates: Option[Boolean],
--- End diff --

Similar to stateId, it is injected later after creation of the physical 
Spark plan, and before execution. See IncrementalExecution, and how stateId is 
injected using the prepare rule. 

I agree that this structure is something we can improve, but that would 
require quite a bit of refactoring and better done later. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64796697
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/utils.scala ---
@@ -33,7 +34,7 @@ object Utils {
   resultExpressions: Seq[NamedExpression],
   child: SparkPlan): Seq[SparkPlan] = {
 
-val completeAggregateExpressions = 
aggregateExpressions.map(_.copy(mode = Complete))
+val completeAggregateExpressions = 
aggregateExpressions.map(_.copy(mode = aggregate.Complete))
--- End diff --

removed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64796643
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -319,14 +362,19 @@ final class DataFrameWriter private[sql](df: 
DataFrame) {
 checkpointPath.toUri.toString
   }
 
-  val sink = new MemorySink(df.schema)
+  if (!Seq(OutputMode.Append, 
OutputMode.Complete).contains(outputMode)) {
--- End diff --

The tricky thing is that we want to make the Memory Sink compatible with 
update internally, but we may not want public API to support update mode yet.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-26 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64796289
  
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -500,6 +500,25 @@ def mode(self, saveMode):
 self._jwrite = self._jwrite.mode(saveMode)
 return self
 
+@since(2.0)
+def outputMode(self, outputMode):
+"""Specifies how data of a streaming DataFrame/Dataset is written 
to a streaming sink.
+
--- End diff --

done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-25 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221520294
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-25 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221520299
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59268/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-25 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221520107
  
**[Test build #59268 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59268/consoleFull)**
 for PR 13286 at commit 
[`58f88b8`](https://github.com/apache/spark/commit/58f88b8f829b7a1408b2ce471b4d5d0a23031ee7).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-25 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221503471
  
**[Test build #59268 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59268/consoleFull)**
 for PR 13286 at commit 
[`58f88b8`](https://github.com/apache/spark/commit/58f88b8f829b7a1408b2ce471b4d5d0a23031ee7).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221464878
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221464882
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59253/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221464810
  
**[Test build #59253 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59253/consoleFull)**
 for PR 13286 at commit 
[`074299c`](https://github.com/apache/spark/commit/074299ca9bf04a3b14d9c54ba7fc2cd2b4bce94b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221457219
  
**[Test build #59253 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59253/consoleFull)**
 for PR 13286 at commit 
[`074299c`](https://github.com/apache/spark/commit/074299ca9bf04a3b14d9c54ba7fc2cd2b4bce94b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64502603
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/OutputMode.java 
---
@@ -15,9 +15,10 @@
  * limitations under the License.
  */
 
-package org.apache.spark.sql.catalyst.analysis
+package org.apache.spark.sql;
--- End diff --

before that i realize that making this java enum prevents us from 
having output modes like `UpdateInPlace("key")` in the future. So we have to 
think about this. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221446192
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221446196
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59239/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221446102
  
**[Test build #59239 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59239/consoleFull)**
 for PR 13286 at commit 
[`3a79d41`](https://github.com/apache/spark/commit/3a79d41b14535ffe4f65595126614832a094bf9b).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221444874
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221444878
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59236/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221444771
  
**[Test build #59236 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59236/consoleFull)**
 for PR 13286 at commit 
[`bb0314d`](https://github.com/apache/spark/commit/bb0314d32dfd42ea6081427042e182706a83b6ff).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread zsxwing

Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64499074
  
--- Diff: python/pyspark/sql/readwriter.py ---
@@ -500,6 +500,25 @@ def mode(self, saveMode):
 self._jwrite = self._jwrite.mode(saveMode)
 return self
 
+@since(2.0)
+def outputMode(self, outputMode):
+"""Specifies how data of a streaming DataFrame/Dataset is written 
to a streaming sink.
+
--- End diff --

nit: add `.. note:: Experimental.`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread zsxwing

Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64499026
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/OutputMode.java 
---
@@ -15,9 +15,10 @@
  * limitations under the License.
  */
 
-package org.apache.spark.sql.catalyst.analysis
+package org.apache.spark.sql;
 
-sealed trait OutputMode
-
-case object Append extends OutputMode
-case object Update extends OutputMode
+public enum OutputMode {
--- End diff --

nit: `@Experimental`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread zsxwing

Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221442177
  
Finished my first round. Looks pretty good. Just some nits.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread marmbrus

Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64498668
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/OutputMode.java 
---
@@ -15,9 +15,10 @@
  * limitations under the License.
  */
 
-package org.apache.spark.sql.catalyst.analysis
+package org.apache.spark.sql;
 
-sealed trait OutputMode
-
-case object Append extends OutputMode
-case object Update extends OutputMode
+public enum OutputMode {
+  Append,
+  Update,
--- End diff --

This actually raises a good question.  I'm not sure if we can use enums 
here as I think that we need to have a notion of a `key` in order to do an 
`Update` mode.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread zsxwing

Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64498404
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StatefulAggregate.scala
 ---
@@ -82,40 +82,62 @@ case class StateStoreRestoreExec(
 case class StateStoreSaveExec(
 keyExpressions: Seq[Attribute],
 stateId: Option[OperatorStateId],
+returnAllStates: Option[Boolean],
--- End diff --

Why is returnAllStates not just `Boolean`?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64496407
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala
 ---
@@ -35,15 +35,36 @@ class IncrementalExecution private[sql](
 val currentBatchId: Long)
   extends QueryExecution(sparkSession, logicalPlan) {
 
-  // TODO: make this always part of planning.
-  val stateStrategy = 
sparkSession.sessionState.planner.StatefulAggregationStrategy :: Nil
-
   // Modified planner with stateful operations.
   override def planner: SparkPlanner =
 new SparkPlanner(
   sparkSession.sparkContext,
   sparkSession.sessionState.conf,
-  stateStrategy)
+  Nil) {
+
+  override def strategies: Seq[Strategy] = {
+StatefulAggregationStrategy +: super.strategies
+  }
+
+  /**
+   * Used to plan aggregation queries that are computed incrementally 
as part of a
+   * [[org.apache.spark.sql.ContinuousQuery]].
+   */
+  object StatefulAggregationStrategy extends Strategy {
+override def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match 
{
+  case PhysicalAggregation(
+namedGroupingExpressions, aggregateExpressions, 
rewrittenResultExpressions, child) =>
+execution.aggregate.Utils.planStreamingAggregation(
+  namedGroupingExpressions,
+  aggregateExpressions,
+  rewrittenResultExpressions,
+  outputMode,
--- End diff --

Nvm. I simplified the logic even further to avoid this refactoring. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread zsxwing

Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64496153
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala ---
@@ -319,14 +362,19 @@ final class DataFrameWriter private[sql](df: 
DataFrame) {
 checkpointPath.toUri.toString
   }
 
-  val sink = new MemorySink(df.schema)
+  if (!Seq(OutputMode.Append, 
OutputMode.Complete).contains(outputMode)) {
--- End diff --

nit: maybe move this logic to the constructor of `MemorySink` so that we 
can make sure no place will pass a wrong OutputMode to MemorySink.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64495984
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala 
---
@@ -114,35 +114,48 @@ case class MemoryStream[A : Encoder](id: Int, 
sqlContext: SQLContext)
  * A sink that stores the results in memory. This [[Sink]] is primarily 
intended for use in unit
  * tests and does not provide durability.
  */
-class MemorySink(val schema: StructType) extends Sink with Logging {
+class MemorySink(val schema: StructType, outputMode: OutputMode) extends 
Sink with Logging {
+
+  private case class AddedData(batchId: Long, data: Array[Row])
+
   /** An order list of batches that have been written to this [[Sink]]. */
   @GuardedBy("this")
-  private val batches = new ArrayBuffer[Array[Row]]()
+  private val batches = new ArrayBuffer[AddedData]()
 
   /** Returns all rows that are stored in this [[Sink]]. */
   def allData: Seq[Row] = synchronized {
-batches.flatten
+batches.map(_.data).flatten
   }
 
-  def latestBatchId: Option[Int] = synchronized {
-if (batches.size == 0) None else Some(batches.size - 1)
+  def latestBatchId: Option[Long] = synchronized {
+batches.lastOption.map(_.batchId)
   }
 
-  def lastBatch: Seq[Row] = synchronized { batches.last }
+  def latestBatchData: Seq[Row] = synchronized { 
batches.lastOption.toSeq.flatten(_.data) }
 
   def toDebugString: String = synchronized {
-batches.zipWithIndex.map { case (b, i) =>
-  val dataStr = try b.mkString(" ") catch {
+batches.map { case AddedData(batchId, data) =>
+  val dataStr = try data.mkString(" ") catch {
 case NonFatal(e) => "[Error converting to string]"
   }
-  s"$i: $dataStr"
+  s"$batchId: $dataStr"
 }.mkString("\n")
   }
 
   override def addBatch(batchId: Long, data: DataFrame): Unit = 
synchronized {
-if (batchId == batches.size) {
-  logDebug(s"Committing batch $batchId")
-  batches.append(data.collect())
+if (latestBatchId.isEmpty || batchId > latestBatchId.get) {
+  logDebug(s"Committing batch $batchId to $this")
+  outputMode match {
+case OutputMode.Append | OutputMode.Update =>
+  batches.append(AddedData(batchId, data.collect()))
+
+case OutputMode.Complete =>
+  batches.clear()
+  batches.append(AddedData(batchId, data.collect()))
+
+case _ =>
+  throw new IllegalArgumentException("Data source ")
--- End diff --

thats a mistake, i left it incomplete. thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64495962
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala 
---
@@ -114,35 +114,48 @@ case class MemoryStream[A : Encoder](id: Int, 
sqlContext: SQLContext)
  * A sink that stores the results in memory. This [[Sink]] is primarily 
intended for use in unit
  * tests and does not provide durability.
  */
-class MemorySink(val schema: StructType) extends Sink with Logging {
+class MemorySink(val schema: StructType, outputMode: OutputMode) extends 
Sink with Logging {
+
+  private case class AddedData(batchId: Long, data: Array[Row])
+
   /** An order list of batches that have been written to this [[Sink]]. */
   @GuardedBy("this")
-  private val batches = new ArrayBuffer[Array[Row]]()
+  private val batches = new ArrayBuffer[AddedData]()
 
   /** Returns all rows that are stored in this [[Sink]]. */
   def allData: Seq[Row] = synchronized {
-batches.flatten
+batches.map(_.data).flatten
   }
 
-  def latestBatchId: Option[Int] = synchronized {
-if (batches.size == 0) None else Some(batches.size - 1)
+  def latestBatchId: Option[Long] = synchronized {
+batches.lastOption.map(_.batchId)
   }
 
-  def lastBatch: Seq[Row] = synchronized { batches.last }
+  def latestBatchData: Seq[Row] = synchronized { 
batches.lastOption.toSeq.flatten(_.data) }
 
   def toDebugString: String = synchronized {
-batches.zipWithIndex.map { case (b, i) =>
-  val dataStr = try b.mkString(" ") catch {
+batches.map { case AddedData(batchId, data) =>
+  val dataStr = try data.mkString(" ") catch {
 case NonFatal(e) => "[Error converting to string]"
   }
-  s"$i: $dataStr"
+  s"$batchId: $dataStr"
 }.mkString("\n")
   }
 
   override def addBatch(batchId: Long, data: DataFrame): Unit = 
synchronized {
-if (batchId == batches.size) {
-  logDebug(s"Committing batch $batchId")
-  batches.append(data.collect())
+if (latestBatchId.isEmpty || batchId > latestBatchId.get) {
+  logDebug(s"Committing batch $batchId to $this")
+  outputMode match {
+case OutputMode.Append | OutputMode.Update =>
--- End diff --

I am wondering whether we should support update mode for memory sink. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221436824
  
**[Test build #59239 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59239/consoleFull)**
 for PR 13286 at commit 
[`3a79d41`](https://github.com/apache/spark/commit/3a79d41b14535ffe4f65595126614832a094bf9b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread zsxwing

Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64495685
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/OutputMode.java 
---
@@ -15,9 +15,10 @@
  * limitations under the License.
  */
 
-package org.apache.spark.sql.catalyst.analysis
+package org.apache.spark.sql;
--- End diff --

nit: move this file to 
sql/catalyst/src/main/**java**/org/apache/spark/sql/OutputMode.java


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread zsxwing

Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64495492
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/ContinuousQueryManagerSuite.scala
 ---
@@ -237,15 +237,14 @@ class ContinuousQueryManagerSuite extends StreamTest 
with SharedSQLContext with
   try {
 val df = ds.toDF
 val metadataRoot =
-  Utils.createTempDir(namePrefix = 
"streaming.metadata").getCanonicalPath
-query = spark
-  .streams
-  .startQuery(
-StreamExecution.nextName,
-metadataRoot,
-df,
-new MemorySink(df.schema))
-  .asInstanceOf[StreamExecution]
+  Utils.createTempDir(namePrefix = 
"streaming.checkpoint").getCanonicalPath
+query =
+  df.write
+.format("memory")
+.option("checkpointLocation", "memory")
--- End diff --

nit: `"memory"` -> `metadataRoot`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread zsxwing

Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64495145
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala 
---
@@ -114,35 +114,48 @@ case class MemoryStream[A : Encoder](id: Int, 
sqlContext: SQLContext)
  * A sink that stores the results in memory. This [[Sink]] is primarily 
intended for use in unit
  * tests and does not provide durability.
  */
-class MemorySink(val schema: StructType) extends Sink with Logging {
+class MemorySink(val schema: StructType, outputMode: OutputMode) extends 
Sink with Logging {
+
+  private case class AddedData(batchId: Long, data: Array[Row])
+
   /** An order list of batches that have been written to this [[Sink]]. */
   @GuardedBy("this")
-  private val batches = new ArrayBuffer[Array[Row]]()
+  private val batches = new ArrayBuffer[AddedData]()
 
   /** Returns all rows that are stored in this [[Sink]]. */
   def allData: Seq[Row] = synchronized {
-batches.flatten
+batches.map(_.data).flatten
   }
 
-  def latestBatchId: Option[Int] = synchronized {
-if (batches.size == 0) None else Some(batches.size - 1)
+  def latestBatchId: Option[Long] = synchronized {
+batches.lastOption.map(_.batchId)
   }
 
-  def lastBatch: Seq[Row] = synchronized { batches.last }
+  def latestBatchData: Seq[Row] = synchronized { 
batches.lastOption.toSeq.flatten(_.data) }
 
   def toDebugString: String = synchronized {
-batches.zipWithIndex.map { case (b, i) =>
-  val dataStr = try b.mkString(" ") catch {
+batches.map { case AddedData(batchId, data) =>
+  val dataStr = try data.mkString(" ") catch {
 case NonFatal(e) => "[Error converting to string]"
   }
-  s"$i: $dataStr"
+  s"$batchId: $dataStr"
 }.mkString("\n")
   }
 
   override def addBatch(batchId: Long, data: DataFrame): Unit = 
synchronized {
-if (batchId == batches.size) {
-  logDebug(s"Committing batch $batchId")
-  batches.append(data.collect())
+if (latestBatchId.isEmpty || batchId > latestBatchId.get) {
+  logDebug(s"Committing batch $batchId to $this")
+  outputMode match {
+case OutputMode.Append | OutputMode.Update =>
--- End diff --

nit: Since we don't support `OutputMode.Update`, could you remove it? I 
think it will have a different logic even if we add it in future.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread zsxwing

Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64495148
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/memory.scala 
---
@@ -114,35 +114,48 @@ case class MemoryStream[A : Encoder](id: Int, 
sqlContext: SQLContext)
  * A sink that stores the results in memory. This [[Sink]] is primarily 
intended for use in unit
  * tests and does not provide durability.
  */
-class MemorySink(val schema: StructType) extends Sink with Logging {
+class MemorySink(val schema: StructType, outputMode: OutputMode) extends 
Sink with Logging {
+
+  private case class AddedData(batchId: Long, data: Array[Row])
+
   /** An order list of batches that have been written to this [[Sink]]. */
   @GuardedBy("this")
-  private val batches = new ArrayBuffer[Array[Row]]()
+  private val batches = new ArrayBuffer[AddedData]()
 
   /** Returns all rows that are stored in this [[Sink]]. */
   def allData: Seq[Row] = synchronized {
-batches.flatten
+batches.map(_.data).flatten
   }
 
-  def latestBatchId: Option[Int] = synchronized {
-if (batches.size == 0) None else Some(batches.size - 1)
+  def latestBatchId: Option[Long] = synchronized {
+batches.lastOption.map(_.batchId)
   }
 
-  def lastBatch: Seq[Row] = synchronized { batches.last }
+  def latestBatchData: Seq[Row] = synchronized { 
batches.lastOption.toSeq.flatten(_.data) }
 
   def toDebugString: String = synchronized {
-batches.zipWithIndex.map { case (b, i) =>
-  val dataStr = try b.mkString(" ") catch {
+batches.map { case AddedData(batchId, data) =>
+  val dataStr = try data.mkString(" ") catch {
 case NonFatal(e) => "[Error converting to string]"
   }
-  s"$i: $dataStr"
+  s"$batchId: $dataStr"
 }.mkString("\n")
   }
 
   override def addBatch(batchId: Long, data: DataFrame): Unit = 
synchronized {
-if (batchId == batches.size) {
-  logDebug(s"Committing batch $batchId")
-  batches.append(data.collect())
+if (latestBatchId.isEmpty || batchId > latestBatchId.get) {
+  logDebug(s"Committing batch $batchId to $this")
+  outputMode match {
+case OutputMode.Append | OutputMode.Update =>
+  batches.append(AddedData(batchId, data.collect()))
+
+case OutputMode.Complete =>
+  batches.clear()
+  batches.append(AddedData(batchId, data.collect()))
+
+case _ =>
+  throw new IllegalArgumentException("Data source ")
--- End diff --

nit: Although we won't reach here, let's still add a better message such as 
`s"Memory sink does not support output mode $outputMode"`


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread zsxwing

Github user zsxwing commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64494863
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/ContinuousQueryManager.scala ---
@@ -175,9 +175,9 @@ class ContinuousQueryManager(sparkSession: 
SparkSession) {
   checkpointLocation: String,
   df: DataFrame,
   sink: Sink,
+  outputMode: OutputMode,
   trigger: Trigger = ProcessingTime(0),
-  triggerClock: Clock = new SystemClock(),
-  outputMode: OutputMode = Append): ContinuousQuery = {
+  triggerClock: Clock = new SystemClock()): ContinuousQuery = {
--- End diff --

I just realized we should make the constructor of `ContinuousQueryManager` 
be `private[sql]` since we don't want people to call it. Could you also fix it 
in this PR?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221435335
  
**[Test build #59236 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59236/consoleFull)**
 for PR 13286 at commit 
[`bb0314d`](https://github.com/apache/spark/commit/bb0314d32dfd42ea6081427042e182706a83b6ff).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221434083
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221434079
  
**[Test build #59235 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59235/consoleFull)**
 for PR 13286 at commit 
[`a6e2bb5`](https://github.com/apache/spark/commit/a6e2bb5b6f0838b18c84086d671b518701cbc26a).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class ListFilesCommand(files: Seq[String] = Seq.empty[String]) 
extends RunnableCommand `
  * `case class ListJarsCommand(jars: Seq[String] = Seq.empty[String]) 
extends RunnableCommand `


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221434090
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59235/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221433717
  
**[Test build #59235 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59235/consoleFull)**
 for PR 13286 at commit 
[`a6e2bb5`](https://github.com/apache/spark/commit/a6e2bb5b6f0838b18c84086d671b518701cbc26a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread tdas

Github user tdas commented on a diff in the pull request:

https://github.com/apache/spark/pull/13286#discussion_r64492042
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingAggregationSuite.scala
 ---
@@ -67,6 +64,43 @@ class StreamingAggregationSuite extends StreamTest with 
SharedSQLContext with Be
 )
   }
 
+  test("simple count, complete mode") {
+val inputData = MemoryStream[Int]
+
+val aggregated =
+  inputData.toDF()
+.groupBy($"value")
+.agg(count("*"))
+.as[(Int, Long)]
+
+testStream(aggregated, OutputMode.Complete)(
+  AddData(inputData, 3),
+  CheckLastBatch((3, 1)),
+  AddData(inputData, 2),
+  CheckLastBatch((3, 1), (2, 1)),
+  StopStream,
+  StartStream(),
+  AddData(inputData, 3, 2, 1),
+  CheckLastBatch((3, 2), (2, 2), (1, 1)),
+  AddData(inputData, 4, 4, 4, 4),
+  CheckLastBatch((4, 4), (3, 2), (2, 2), (1, 1))
+)
+  }
+
+  test("simple count, append mode") {
+val inputData = MemoryStream[Int]
+
+val aggregated =
+  inputData.toDF()
+.groupBy($"value")
+.agg(count("*"))
+.as[(Int, Long)]
+
+intercept[AnalysisException] {
--- End diff --

add checks on message


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221429134
  
Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221429136
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/59232/
Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-15517][SQL][STREAMING] Add support for ...

2016-05-24 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/13286#issuecomment-221429129
  
**[Test build #59232 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/59232/consoleFull)**
 for PR 13286 at commit 
[`61af057`](https://github.com/apache/spark/commit/61af0573a112a54bab05c070de7a36b8c74703dc).
 * This patch **fails Python style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

1 2 >

1 - 100 of 109 matches

Mail list logo