Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/5604#issuecomment-97518465
Hi,
I have been experimenting with Window functions in Spark SQL as well. It
has been partially based on this. You can find my work
[here](https://github.com
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/6104#issuecomment-101519756
Hi,
In the JIRA the following example is given:
```
df.select(
  df.store,
  df.date,
  df.sales,
  avg(df.sales).over.partitionBy
```
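The snippet above (cut off mid-expression) partitions the rows by store and attaches the per-partition average to every row. As an illustration of those semantics only, here is a minimal pure-Python sketch — not the Spark API; the helper name and row layout are invented for the example:

```python
from collections import defaultdict

def avg_over_partition(rows, partition_key, value_key):
    """Average value_key over every row sharing the same partition_key,
    i.e. an unbounded window frame per partition."""
    sums = defaultdict(lambda: [0.0, 0])
    for row in rows:
        acc = sums[row[partition_key]]
        acc[0] += row[value_key]
        acc[1] += 1
    return [
        dict(row, avg=sums[row[partition_key]][0] / sums[row[partition_key]][1])
        for row in rows
    ]

rows = [
    {"store": "A", "date": "2015-05-01", "sales": 10},
    {"store": "A", "date": "2015-05-02", "sales": 30},
    {"store": "B", "date": "2015-05-01", "sales": 5},
]
result = avg_over_partition(rows, "store", "sales")
# both store-A rows carry avg 20.0, the store-B row carries 5.0
```

Unlike a group-by, the window aggregate keeps one output row per input row.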
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/6278
[SPARK-7712] [SQL] Move Window Functions from Hive UDAFS to Spark Native
backend [WIP]
This PR aims to improve the current window functionality in Spark SQL in
the following ways:
- Moving
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/6278#issuecomment-110769725
Thanks for starting the test process.
I'll take a look at the merge issues today.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well.
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/7057
[SPARK-8638] [SQL] Window Function Performance Improvements
## Description
Performance improvements for Spark Window functions. This PR will also
serve as the basis for moving away from Hive
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/6278#issuecomment-111584747
A small status update:
- The PR currently passes all tests.
- The code has been rebased, but this is a moving target due to the
frequent commits to the SQL
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r33641164
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -37,443 +59,622 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r33641261
--- Diff:
sql/hive/compatibility/src/test/scala/org/apache/spark/sql/hive/execution/HiveWindowFunctionQuerySuite.scala
---
@@ -749,7 +749,7 @@ abstract
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r33641591
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveDataFrameWindowSuite.scala
---
@@ -189,7 +189,7 @@ class HiveDataFrameWindowSuite extends
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r33641018
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -19,17 +19,39 @@ package org.apache.spark.sql.execution
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r33642234
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -37,443 +59,622 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r33642464
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -19,17 +19,39 @@ package org.apache.spark.sql.execution
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/8113#discussion_r36873052
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
---
@@ -130,24 +147,67 @@ case class First
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/8113#discussion_r36873218
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
---
@@ -130,24 +147,67 @@ case class First
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/8113#discussion_r36818586
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
---
@@ -130,23 +138,36 @@ case class First
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/8113#issuecomment-130343660
One more small thing. We should probably also add the ```ignoreNulls```
option to the ```first``` and ```last``` dataframe functions.
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/8298
[SPARK-10100] [SQL] Performance improvements to new MIN/MAX aggregate
functions.
The new MIN/MAX suffer from a performance regression. This PR aims to fix
this by simplifying the evaluation
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/8362
[SPARK-9741][SQL] Approximate Count Distinct using the new UDAF interface.
This PR implements a HyperLogLog based Approximate Count Distinct function
using the new UDAF interface
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/8362#issuecomment-133566608
Thanks.
I was aiming for compatibility with the existing approxCountDistinct, but
we can also implement HLL++. HLL++ introduces three (orthogonal
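For readers unfamiliar with the technique behind SPARK-9741: HyperLogLog estimates a distinct count by hashing each value, bucketing on the first few hash bits, and keeping per bucket the maximal position of the first 1-bit in the rest of the hash. The following is a toy sketch, not the PR's implementation — the register count, hash choice, and small-range correction are just the textbook defaults:

```python
import hashlib
import math

def hll_estimate(items, p=10):
    """Toy HyperLogLog: the first p hash bits pick one of m = 2**p
    registers; each register keeps the max position of the first 1-bit
    seen in the remaining 64 - p hash bits."""
    m = 1 << p
    registers = [0] * m
    for item in items:
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h >> (64 - p)                      # register index
        rest = h & ((1 << (64 - p)) - 1)         # remaining bits
        rank = (64 - p) - rest.bit_length() + 1  # leftmost 1-bit position
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)
    if raw <= 2.5 * m and 0 in registers:        # linear-counting correction
        raw = m * math.log(m / registers.count(0))
    return raw
```

With m = 1024 registers the standard error is roughly 1.04 / sqrt(m), about 3%; HLL++ refines exactly these correction and storage details.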
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7515#discussion_r35946650
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
---
@@ -818,7 +818,8 @@ class Analyzer
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7940#discussion_r36343368
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastNestedLoopJoin.scala
---
@@ -47,7 +47,7 @@ case class
Github user hvanhovell closed the pull request at:
https://github.com/apache/spark/pull/6278
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/6278#issuecomment-126397148
This PR is redundant now. See SPARK-8638 SPARK-8640 and SPARK-8641 for the
proposed implementation.
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7904#discussion_r36487283
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoin.scala
---
@@ -56,117 +52,247 @@ case class SortMergeJoin
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/7942
[SPARK-9357][SQL] Remove JoinedRow/Introduce JoinedProjection [WIP]
```JoinedRow```'s are used to join two rows together, and are used in a lot
of the most performance-critical sections of Spark
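To illustrate the idea behind ```JoinedRow```: present two rows as one logical row without copying any fields. A hypothetical Python analogue (for illustration only; the real class is a Scala ```InternalRow```):

```python
class JoinedRow:
    """Present two row tuples as one logical row without copying fields.
    Hypothetical Python analogue of Spark's JoinedRow, for illustration."""

    def __init__(self, left, right):
        self.left, self.right = left, right

    def __len__(self):
        return len(self.left) + len(self.right)

    def __getitem__(self, i):
        # Field indices below len(left) read the left row, the rest the right.
        n = len(self.left)
        return self.left[i] if i < n else self.right[i - n]

row = JoinedRow((1, "a"), (2.5, True))
```

The appeal is that a join operator can emit output rows in O(1) per match; the cost is an extra branch on every field access, which is what the PR's ```JoinedProjection``` idea targets.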
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/8209#issuecomment-131206616
I have replaced ```p/``` tags by ```p``` in all java files I could find
them in. I haven't touched the ```BytesToByteMap```, ```Bin``` and
```PagedTable``` classes
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/8209#issuecomment-131210873
I'd rather leave the scala source alone for now.
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/8209#issuecomment-131198780
I'll change those as well. It is strange that these didn't create problems;
I guess it has something to do with the position of the tag in the line.
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/8209#discussion_r37103311
--- Diff:
launcher/src/main/java/org/apache/spark/launcher/SparkLauncher.java ---
@@ -211,10 +211,10 @@ public SparkLauncher addSparkArg(String arg
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/8209
[SPARK-9980][BUILD] Fix SBT publishLocal error due to invalid characters in
doc
Tiny modification to a few comments to make ```sbt publishLocal``` work again.
You can merge this pull request into a Git
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/8216#discussion_r37136449
--- Diff: core/src/main/scala/org/apache/spark/scheduler/JobWaiter.scala ---
@@ -17,6 +17,10 @@
package org.apache.spark.scheduler
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/8216#discussion_r37137085
--- Diff: core/src/main/scala/org/apache/spark/scheduler/JobWaiter.scala ---
@@ -17,6 +17,10 @@
package org.apache.spark.scheduler
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/8113#discussion_r36818762
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
---
@@ -130,23 +138,36 @@ case class First
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/8113#discussion_r36818719
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
---
@@ -130,23 +138,36 @@ case class First
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/8113#issuecomment-130128926
Besides the ```valueSet``` update/merge expressions LGTM.
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/8113#issuecomment-130128263
LGTM. One final question, shouldn't we introduce a ```skipNulls```
parameter? Or do you want to address this in a follow-up PR?
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/8113#discussion_r36878009
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
---
@@ -130,24 +147,67 @@ case class First
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/8113#discussion_r36877100
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
---
@@ -130,24 +147,67 @@ case class First
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/7715
[SPARK-8641][SPARK-8712][SQL] Native Spark Window Functions [WIP]
This replaces the Hive UDAFs with native Spark SQL UDAFs using the new UDAF
interface. See the JIRA ticket for more information
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/7715#issuecomment-125431976
cc @yhuai
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7417#discussion_r35433399
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -213,10 +213,51 @@ private[sql] abstract class
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/7417#issuecomment-124562458
@Sephiroth-Lin The performance improvement sounds really good. It seems
like a good thing to put in Spark.
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7417#discussion_r35433038
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -213,10 +213,51 @@ private[sql] abstract class
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34844383
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -38,443 +84,661 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7379#discussion_r34891100
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastRangeJoin.scala
---
@@ -0,0 +1,411 @@
+/*
+ * Licensed
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7417#discussion_r35422390
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastCartesianProduct.scala
---
@@ -0,0 +1,80 @@
+/*
+ * Licensed
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-121250819
Current test errors are a bit weird. They shouldn't have been caused by
this change, because the functionality is disabled by default.
Rebased to most recent
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34602077
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -37,443 +59,622 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34603151
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -37,443 +59,622 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34616177
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -38,443 +68,645 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34617761
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -38,443 +68,645 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34618909
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -38,443 +68,645 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34857571
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/WindowSuite.scala
---
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/7057#issuecomment-122158569
@yhuai the benchmarking results are attached. It might be interesting to
see how the operator performs on different datasets.
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-122177553
The = case is quite easy to implement.
This implementation is currently targeted at range joining a rather small
(broadcastable) to an arbitrarily large
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7379#discussion_r34862439
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastRangeJoin.scala
---
@@ -0,0 +1,411 @@
+/*
+ * Licensed
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7057#discussion_r34844502
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/Window.scala ---
@@ -38,443 +68,645 @@ case class Window(
child: SparkPlan
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/7379#issuecomment-12257
No problem.
### Supporting N-Ary Predicates.
In order to make the range join work we need the predicates to define a
single interval for each side
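The interval idea can be sketched as follows: broadcast the small side, sort its intervals by lower bound, and cut each probe's candidate list with a binary search. This is an illustrative sketch under invented field names (```low```, ```high```), not the PR's code:

```python
import bisect

def broadcast_range_join(small, large, point_key):
    """Sketch of a broadcast range join: the small side holds interval
    rows; each large-side row matches every interval containing its
    point. Sorting by lower bound bounds the candidate scan."""
    intervals = sorted(small, key=lambda r: r["low"])
    lows = [r["low"] for r in intervals]
    out = []
    for row in large:
        p = row[point_key]
        # Only intervals starting at or before p can contain it.
        for cand in intervals[:bisect.bisect_right(lows, p)]:
            if cand["high"] >= p:
                out.append({**cand, **row})
    return out
```

A nested-loop join would test every (small, large) pair; here the sort plus ```bisect``` prunes intervals that start after the probe point, which is the win the PR is after.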
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/7513#issuecomment-122707719
@yhuai Don't think jenkins picked up your OK.
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/7513#issuecomment-122674419
cc @yhuai
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/7513
[SPARK-8638] [SQL] Window Function Performance Improvements - Cleanup
This PR contains a few clean-ups that are a part of SPARK-8638: a few style
issues got fixed, and a few tests were moved
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/7515#issuecomment-122912351
cc @yhuai
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/7379
[SPARK-8682][SQL][WIP] Range Join
*...copied from JIRA (SPARK-8682):*
Currently Spark SQL uses a Broadcast Nested Loop join (or a filtered
Cartesian Join) when it has to execute
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/7417#issuecomment-123296087
Do you have any benchmarking results for this? Would be great to see how
much this improves the current situation.
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7417#discussion_r35098517
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastCartesianProduct.scala
---
@@ -0,0 +1,80 @@
+/*
+ * Licensed
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/7417#discussion_r34681979
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/joins/CartesianProduct.scala
---
@@ -34,7 +34,15 @@ case class CartesianProduct(left
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/8383#discussion_r37883244
--- Diff:
unsafe/src/main/java/org/apache/spark/unsafe/bitset/BitSetMethods.java ---
@@ -68,6 +68,19 @@ public static boolean isSet(Object baseObject
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/8298#issuecomment-134620114
@adrian-wang the improvement is absolutely tiny, about 2-3% if you do a lot
of ```min```'s or ```max```'es.
This PR was a response to misdiagnosed
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9280#discussion_r43067869
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/distinctFallback.scala
---
@@ -0,0 +1,173
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9280#discussion_r43116722
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/distinctFallback.scala
---
@@ -0,0 +1,173
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/8587#discussion_r43439837
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
---
@@ -524,6 +525,133 @@ case class Sum(child
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/8587#discussion_r43162271
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/aggregate/functions.scala
---
@@ -524,6 +525,133 @@ case class Sum(child
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/9339
[SPARK-11388][Build]Fix self closing tags.
Java 8 javadoc does not like self-closing tags: ``, ``, ...
This PR fixes those.
You can merge this pull request into a Git repository
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9339#issuecomment-152135261
jenkins retest this please
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9339#issuecomment-152135237
Hmmm, test fails with the following beautiful error:
[error] (core/test:test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/9280
[SPARK-9241] [SQL] [WIP] Supporting multiple DISTINCT columns
This PR adds support for multiple distinct columns to the new aggregation
code path.
The implementation uses
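As background on what "multiple DISTINCT columns" computes: queries like ```COUNT(DISTINCT a), COUNT(DISTINCT b)``` need independent deduplication per column within each group. A naive sketch keeps one set per (group, distinct column) — invented helper for illustration only; the real code path avoids materializing these sets:

```python
from collections import defaultdict

def multi_distinct_counts(rows, group_key, distinct_keys):
    """Naive multi-DISTINCT aggregation: one set per (group, column).
    Illustration only -- Spark's implementation does not build sets."""
    sets = defaultdict(lambda: [set() for _ in distinct_keys])
    for row in rows:
        for s, key in zip(sets[row[group_key]], distinct_keys):
            s.add(row[key])
    return {g: [len(s) for s in group_sets] for g, group_sets in sets.items()}
```

The per-group memory of this sketch is exactly what makes a single DISTINCT easy and several of them hard for a streaming aggregation operator.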
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/9568
[SPARK-11594][SQL][REPL] Cannot create UDAF in REPL
This PR enables users to create a UDAF in the REPL without getting a
```java.lang.InternalError```.
You can merge this pull request
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9556#discussion_r44269743
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/expressions/WindowSpec.scala ---
@@ -141,40 +141,46 @@ class WindowSpec private[sql
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9568#discussion_r44273796
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/expressions/udaf.scala ---
@@ -129,6 +129,13 @@ abstract class UserDefinedAggregateFunction extends
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9556#issuecomment-155054384
Made a quick pass. I think the PR is in good shape. I do have a few
additional questions/remarks:
* Maybe we should add some documentation to ```DeclarativeAggregate
Github user hvanhovell closed the pull request at:
https://github.com/apache/spark/pull/9568
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9568#issuecomment-155710292
Moving to Scala 2.10.5 fixed this. Closing PR.
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9568#discussion_r44469882
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/udaf.scala ---
@@ -452,7 +452,7 @@ private[sql] case class ScalaUDAF
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9602#issuecomment-155576674
Does a REPL defined class blow up with a ```java.lang.InternalError```? If
it does, then we have the same problem:
https://github.com/apache/spark/pull/9568
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9566#issuecomment-155578114
I'll rebase.
The potential bug is caused by the fact that the attributes for the
distinct columns and the expressions for the distinct columns can possibly
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9566#discussion_r44514711
--- Diff:
sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/AggregationQuerySuite.scala
---
@@ -545,19 +576,21 @@ abstract class
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9566#discussion_r44472817
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DistinctAggregationRewriter.scala
---
@@ -151,11 +151,12 @@ case class
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9568#issuecomment-155589310
Sounds like a good idea. I'll add this in the morning.
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9568#issuecomment-155582285
This is actually a scala problem:
https://issues.scala-lang.org/browse/SI-9051
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9409#issuecomment-154825955
retest this please
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9409#issuecomment-154826382
Jenkins does not like me...
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9409#issuecomment-154831183
@yhuai can you get jenkins to test this?
The bug exposed by this patch affected the regular aggregation path, as
soon as we used more than one regular
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9409#issuecomment-154825193
test this please
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9556#issuecomment-154984531
retest this please
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9642#issuecomment-156693107
```UnsafeRow``` and ```SpecificRow``` have similar problems. Shouldn't we
fix those as well? For example:
import org.apache.spark.sql.types.IntegerType
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9642#issuecomment-156697230
> You mean discard change and only update documentation? Yes this is one
option - I even think that it is not so bad.
I would only update the documentat
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9556#issuecomment-155225000
PySpark results are correct, just not in the correct form :(...
Github user hvanhovell commented on the pull request:
https://github.com/apache/spark/pull/9556#issuecomment-155224746
LGTM, pending a successful test build.
Github user hvanhovell commented on a diff in the pull request:
https://github.com/apache/spark/pull/9556#discussion_r44343490
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -146,148 +146,105 @@ private[sql] abstract class
GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/9566
[SPARK-9241][SQL] Supporting multiple DISTINCT columns - follow-up (3)
This PR is a 2nd follow-up for
[SPARK-9241](https://issues.apache.org/jira/browse/SPARK-9241). It contains the
following