spark git commit: [SPARK-9294][SQL] cleanup comments, code style, naming typo for the new aggregation

2015-07-23 Thread rxin
Repository: spark Updated Branches: refs/heads/master d4d762f27 -> 408e64b28 [SPARK-9294][SQL] cleanup comments, code style, naming typo for the new aggregation fix some comments and code style for https://github.com/apache/spark/pull/7458 Author: Wenchen Fan Closes #7619 from cloud-fan/ag

spark git commit: [SPARK-9200][SQL] Don't implicitly cast non-atomic types to string type.

2015-07-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 408e64b28 -> cb8c241f0 [SPARK-9200][SQL] Don't implicitly cast non-atomic types to string type. Author: Reynold Xin Closes #7636 from rxin/complex-string-implicit-cast and squashes the following commits: 3e67327 [Reynold Xin

spark git commit: [build] Enable memory leak detection for Tungsten.

2015-07-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master cb8c241f0 -> 8fe32b4f7 [build] Enable memory leak detection for Tungsten. This was turned off accidentally in #7591. Author: Reynold Xin Closes #7637 from rxin/enable-mem-leak-detect and squashes the following commits: 34bc

[2/2] spark git commit: [SPARK-9285][SQL] Remove InternalRow's inheritance from Row.

2015-07-24 Thread rxin
[SPARK-9285][SQL] Remove InternalRow's inheritance from Row. I also changed InternalRow's size/length function to numFields, to make it more obvious that it is not about bytes, but the number of fields. Author: Reynold Xin Closes #7626 from rxin/internalRow and squashes the followi

[1/2] spark git commit: [SPARK-9285][SQL] Remove InternalRow's inheritance from Row.

2015-07-24 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3aec9f4e2 -> 431ca39be http://git-wip-us.apache.org/repos/asf/spark/blob/431ca39b/sql/hive/src/test/scala/org/apache/spark/sql/sources/CommitFailureTestRelationSuite.scala -

spark git commit: [SPARK-9305] Rename org.apache.spark.Row to Item.

2015-07-24 Thread rxin
eveloper wants most of the time. Author: Reynold Xin Closes #7638 from rxin/remove-row and squashes the following commits: aeda52d [Reynold Xin] [SPARK-9305] Rename org.apache.spark.Row to Item. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/r

spark git commit: [SPARK-9330][SQL] Create specialized getStruct getter in InternalRow.

2015-07-24 Thread rxin
ses #7654 from rxin/getStruct and squashes the following commits: b491a09 [Reynold Xin] Fixed typo. 48d77e5 [Reynold Xin] [SPARK-9330][SQL] Create specialized getStruct getter in InternalRow. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/

spark git commit: [SPARK-9331][SQL] Add a code formatter to auto-format generated code.

2015-07-24 Thread rxin
ode automatically when we output them to the screen. Author: Reynold Xin Closes #7656 from rxin/codeformatter and squashes the following commits: 5ba0e90 [Reynold Xin] [SPARK-9331][SQL] Add a code formatter to auto-format generated code. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Com

spark git commit: [Spark-8668][SQL] Adding expr to functions

2015-07-25 Thread rxin
Repository: spark Updated Branches: refs/heads/master 19bcd6ab1 -> 723db13e0 [Spark-8668][SQL] Adding expr to functions Author: JD Author: Joseph Batchik Closes #7606 from JDrit/expr and squashes the following commits: ad7f607 [Joseph Batchik] fixing python linter error 9d6daea [Joseph Bat

spark git commit: [SPARK-9336][SQL] Remove extra JoinedRows

2015-07-25 Thread rxin
ten variant of the operators. Author: Reynold Xin Closes #7659 from rxin/remove-joinedrows and squashes the following commits: 7510447 [Reynold Xin] [SPARK-9336][SQL] Remove extra JoinedRows Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/sp

spark git commit: [SPARK-9334][SQL] Remove UnsafeRowConverter in favor of UnsafeProjection.

2015-07-25 Thread rxin
Xin Closes #7658 from rxin/unsafeconverters and squashes the following commits: ed19e6c [Reynold Xin] Updated support types. 2a56d7e [Reynold Xin] [SPARK-9334][SQL] Remove UnsafeRowConverter in favor of UnsafeProjection. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-

spark git commit: [SPARK-9285] [SQL] Fixes Row/InternalRow conversion for HadoopFsRelation

2015-07-25 Thread rxin
Repository: spark Updated Branches: refs/heads/master c980e20cf -> e2ec018e3 [SPARK-9285] [SQL] Fixes Row/InternalRow conversion for HadoopFsRelation This is a follow-up of #7626. It fixes `Row`/`InternalRow` conversion for data sources extending `HadoopFsRelation` with `needConversion` being

spark git commit: [SPARK-9192][SQL] add initialization phase for nondeterministic expression

2015-07-25 Thread rxin
Repository: spark Updated Branches: refs/heads/master e2ec018e3 -> 2c94d0f24 [SPARK-9192][SQL] add initialization phase for nondeterministic expression Currently, nondeterministic expressions are broken without an explicit initialization phase. Let me take `MonotonicallyIncreasingID` as an examp

spark git commit: [SPARK-9348][SQL] Remove apply method on InternalRow.

2015-07-25 Thread rxin
Repository: spark Updated Branches: refs/heads/master 2c94d0f24 -> b1f4b4abf [SPARK-9348][SQL] Remove apply method on InternalRow. Author: Reynold Xin Closes #7665 from rxin/remove-row-apply and squashes the following commits: 0b43001 [Reynold Xin] support getString in UnsafeRow. 176d

spark git commit: [SPARK-9350][SQL] Introduce an InternalRow generic getter that requires a DataType

2015-07-25 Thread rxin
hor: Reynold Xin Closes #7666 from rxin/generic-getter-with-datatype and squashes the following commits: ee2874c [Reynold Xin] Add a default implementation for getStruct. 1e109a0 [Reynold Xin] [SPARK-9350][SQL] Introduce an InternalRow generic getter that requires a DataType. 033ee88 [Reynold

spark git commit: [SPARK-9354][SQL] Remove InternalRow.get generic getter call in Hive integration code.

2015-07-26 Thread rxin
old Xin Closes #7669 from rxin/row-generic-getter-hive and squashes the following commits: 3467d8e [Reynold Xin] [SPARK-9354][SQL] Remove Internal.get generic getter call in Hive integration code. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/re

spark git commit: [SPARK-9356][SQL]Remove the internal use of DecimalType.Unlimited

2015-07-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6c400b4f3 -> fb5d43fb2 [SPARK-9356][SQL]Remove the internal use of DecimalType.Unlimited JIRA: https://issues.apache.org/jira/browse/SPARK-9356 Author: Yijie Shen Closes #7671 from yjshen/deprecated_unlimit and squashes the following com

[1/3] spark git commit: [SPARK-9095] [SQL] Removes the old Parquet support

2015-07-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6b2baec04 -> c025c3d0a http://git-wip-us.apache.org/repos/asf/spark/blob/c025c3d0/sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala -- diff --git a/sql/

[3/3] spark git commit: [SPARK-9095] [SQL] Removes the old Parquet support

2015-07-26 Thread rxin
[SPARK-9095] [SQL] Removes the old Parquet support This PR removes the old Parquet support: - Removes the old `ParquetRelation` together with related SQL configuration, plan nodes, strategies, utility classes, and test suites. - Renames `ParquetRelation2` to `ParquetRelation` - Renames `RowRea

[2/3] spark git commit: [SPARK-9095] [SQL] Removes the old Parquet support

2015-07-26 Thread rxin
http://git-wip-us.apache.org/repos/asf/spark/blob/c025c3d0/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableSupport.scala -- diff --git a/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableSupport.sca

spark git commit: [SPARK-8867][SQL] Support list / describe function usage

2015-07-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master c025c3d0a -> 1efe97dc9 [SPARK-8867][SQL] Support list / describe function usage As Hive does, we need to list all of the registered UDFs and their usage for the user. We add an annotation to describe a UDF, so we can get the literal description

spark git commit: [SPARK-9368][SQL] Support get(ordinal, dataType) generic getter in UnsafeRow.

2015-07-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master 945d8bcbf -> aa80c64fc [SPARK-9368][SQL] Support get(ordinal, dataType) generic getter in UnsafeRow. Author: Reynold Xin Closes #7682 from rxin/unsaferow-generic-getter and squashes the following commits: 3063788 [Reynold Xin] Re

spark git commit: [SPARK-9371][SQL] fix the support for special chars in column names for hive context

2015-07-26 Thread rxin
Repository: spark Updated Branches: refs/heads/master aa80c64fc -> 4ffd3a1db [SPARK-9371][SQL] fix the support for special chars in column names for hive context Author: Wenchen Fan Closes #7684 from cloud-fan/hive and squashes the following commits: da21ffe [Wenchen Fan] fix the support f

spark git commit: [SPARK-9371][SQL] fix SPARK-9371 for branch 1.4

2015-07-27 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.4 cfca1c5af -> 43035b4b4 [SPARK-9371][SQL] fix SPARK-9371 for branch 1.4 a follow up of https://github.com/apache/spark/pull/7684 Author: Wenchen Fan Closes #7690 from cloud-fan/branch-1.4 and squashes the following commits: 450904d [

spark git commit: [SPARK-9369][SQL] Support IntervalType in UnsafeRow

2015-07-27 Thread rxin
Repository: spark Updated Branches: refs/heads/master dd9ae7945 -> 75438422c [SPARK-9369][SQL] Support IntervalType in UnsafeRow Author: Wenchen Fan Closes #7688 from cloud-fan/interval and squashes the following commits: 5b36b17 [Wenchen Fan] fix codegen a99ed50 [Wenchen Fan] address comme

spark git commit: [HOTFIX] Disable pylint since it is failing master.

2015-07-27 Thread rxin
Repository: spark Updated Branches: refs/heads/master 75438422c -> 85a50a635 [HOTFIX] Disable pylint since it is failing master. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/85a50a63 Tree: http://git-wip-us.apache.org/r

spark git commit: Closes #7690 since it has been merged into branch-1.4.

2015-07-27 Thread rxin
Repository: spark Updated Branches: refs/heads/master 85a50a635 -> fa84e4a7b Closes #7690 since it has been merged into branch-1.4. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fa84e4a7 Tree: http://git-wip-us.apache.or

spark git commit: [SPARK-9349] [SQL] UDAF cleanup

2015-07-27 Thread rxin
Repository: spark Updated Branches: refs/heads/master fa84e4a7b -> 55946e76f [SPARK-9349] [SQL] UDAF cleanup https://issues.apache.org/jira/browse/SPARK-9349 With this PR, we only expose `UserDefinedAggregateFunction` (an abstract class) and `MutableAggregationBuffer` (an interface). Other i

spark git commit: [SPARK-9378] [SQL] Fixes test case "CTAS with serde"

2015-07-27 Thread rxin
Repository: spark Updated Branches: refs/heads/master 55946e76f -> 8e7d2bee2 [SPARK-9378] [SQL] Fixes test case "CTAS with serde" This is a proper version of PR #7693, authored by viirya. The reason why "CTAS with serde" fails is that the `MetastoreRelation` gets converted to a Parquet data so

spark git commit: [SPARK-9355][SQL] Remove InternalRow.get generic getter call in columnar cache code

2015-07-27 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8e7d2bee2 -> 3ab7525dc [SPARK-9355][SQL] Remove InternalRow.get generic getter call in columnar cache code Author: Wenchen Fan Closes #7673 from cloud-fan/row-generic-getter-columnar and squashes the following commits: 88b1170 [Wenchen

spark git commit: [SPARK-8195] [SPARK-8196] [SQL] udf next_day last_day

2015-07-27 Thread rxin
rom adrian-wang/udfnlday and squashes the following commits: ef7e3da [Daoyuan Wang] fix 02b3426 [Daoyuan Wang] address 2 comments dc69630 [Daoyuan Wang] address comments from rxin 8846086 [Daoyuan Wang] address comments from rxin d09bcce [Daoyuan Wang] multi fix 1a9de3d [Daoyuan Wang] function next_day

spark git commit: [SPARK-9395][SQL] Create a SpecializedGetters interface to track all the specialized getters.

2015-07-27 Thread rxin
us prevent missing a method in some interfaces. Author: Reynold Xin Closes #7713 from rxin/SpecializedGetters and squashes the following commits: 3b39be1 [Reynold Xin] Added override modifier. 567ba9c [Reynold Xin] [SPARK-9395][SQL] Create a SpecializedGetters interface to track all

spark git commit: Fixed a test failure.

2015-07-27 Thread rxin
Repository: spark Updated Branches: refs/heads/master 84da8792e -> 3bc7055e2 Fixed a test failure. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/3bc7055e Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/3bc7055e D

spark git commit: [SPARK-9373][SQL] Support StructType in Tungsten projection

2015-07-27 Thread rxin
ect operator that projects data directly into UnsafeRow. Note that I'm not sure if this is the way we want to structure Unsafe+codegen operators, but we can defer that decision to follow-up pull requests. Author: Reynold Xin Closes #7689 from rxin/tungsten-struct-type and squashes the following

spark git commit: Closes #6836 since Round has already been implemented.

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master d93ab93d6 -> fc3bd96bc Closes #6836 since Round has already been implemented. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fc3bd96b Tree: http://git-wip-us.apache.or

spark git commit: [SPARK-9394][SQL] Handle parentheses in CodeFormatter.

2015-07-28 Thread rxin
tch, it is formatted this way: ``` foo( a, b, c) ``` Author: Reynold Xin Closes #7712 from rxin/codeformat-parentheses and squashes the following commits: c2b1c5f [Reynold Xin] Took square bracket out 3cfb174 [Reynold Xin] Code review feedback. 91f5bb1 [Reynold Xin] [SPARK-9394][SQL] Han

spark git commit: [SPARK-9402][SQL] Remove CodegenFallback from Abs / FormatNumber.

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4af622c85 -> 5a2330e54 [SPARK-9402][SQL] Remove CodegenFallback from Abs / FormatNumber. Both expressions already implement code generation. Author: Reynold Xin Closes #7723 from rxin/abs-formatnum and squashes the following comm

spark git commit: [SPARK-9373][SQL] follow up for StructType support in Tungsten projection.

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 5a2330e54 -> c740bed17 [SPARK-9373][SQL] follow up for StructType support in Tungsten projection. Author: Reynold Xin Closes #7720 from rxin/struct-followup and squashes the following commits: d9757f5 [Reynold Xin] [SPARK-9373][

spark git commit: [SPARK-8196][SQL] Fix null handling & documentation for next_day.

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master c740bed17 -> 9bbe0171c [SPARK-8196][SQL] Fix null handling & documentation for next_day. The original patch didn't handle nulls correctly for next_day. Author: Reynold Xin Closes #7718 from rxin/next_day and squashes t

spark git commit: [SPARK-9196] [SQL] Ignore test DatetimeExpressionsSuite: function current_timestamp.

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 31ec6a871 -> 6cdcc21fe [SPARK-9196] [SQL] Ignore test DatetimeExpressionsSuite: function current_timestamp. This test is flaky. https://issues.apache.org/jira/browse/SPARK-9196 will track the fix of it. For now, let's disable this test.

spark git commit: [SPARK-8003][SQL] Added virtual column support to Spark

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8d5bb5283 -> b88b868eb [SPARK-8003][SQL] Added virtual column support to Spark Added virtual column support by adding a new resolution rule to the query analyzer. Additional virtual columns can be added by adding case expressions to [the

spark git commit: [SPARK-9420][SQL] Move expressions in sql/core package to catalyst.

2015-07-28 Thread rxin
s a followup of #7478. Author: Reynold Xin Closes #7735 from rxin/SPARK-8003 and squashes the following commits: 2ffbdc3 [Reynold Xin] [SPARK-8003][SQL] Move expressions in sql/core package to catalyst. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.

spark git commit: [SPARK-9418][SQL] Use sort-merge join as the default shuffle join.

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master b7f54119f -> 6662ee212 [SPARK-9418][SQL] Use sort-merge join as the default shuffle join. Sort-merge join is more robust in Spark since sorting can be done using the Tungsten sort operator. Author: Reynold Xin Closes #7733 from r

spark git commit: [SPARK-8608][SPARK-8609][SPARK-9083][SQL] reset mutable states of nondeterministic expression before evaluation and fix PullOutNondeterministic

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3744b7fd4 -> 429b2f0df [SPARK-8608][SPARK-8609][SPARK-9083][SQL] reset mutable states of nondeterministic expression before evaluation and fix PullOutNondeterministic We will do local projection for LocalRelation, and thus reuse the same

spark git commit: [SPARK-9419] ShuffleMemoryManager and MemoryStore should track memory on a per-task, not per-thread, basis

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 429b2f0df -> ea49705bd [SPARK-9419] ShuffleMemoryManager and MemoryStore should track memory on a per-task, not per-thread, basis Spark's ShuffleMemoryManager and MemoryStore track memory on a per-thread basis, which causes problems in th

spark git commit: [SPARK-9281] [SQL] use decimal or double when parsing SQL

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6309b9346 -> 15667a0af [SPARK-9281] [SQL] use decimal or double when parsing SQL Right now, we use double to parse all floating-point numbers in SQL. When one is used in an expression together with DecimalType, it will turn the decimal into double a
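
The snippet above is cut off, but the general concern it raises (a decimal value losing exactness once it is turned into a double) can be illustrated outside Spark. The following is a minimal Python sketch using only the standard library; it does not use Spark's parser or its `DecimalType`, and the literals are chosen purely for illustration.

```python
from decimal import Decimal

# Parsing the literals as binary doubles: the sum is only approximately 0.3.
as_double = float("0.1") + float("0.2")

# Parsing the same literals as decimals keeps the arithmetic exact.
as_decimal = Decimal("0.1") + Decimal("0.2")

print(as_double == 0.3)              # False: binary rounding error
print(as_decimal == Decimal("0.3"))  # True: exact decimal arithmetic
```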

spark git commit: [SPARK-9251][SQL] do not order by expressions which still need evaluation

2015-07-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master 15667a0af -> 708794e8a [SPARK-9251][SQL] do not order by expressions which still need evaluation Per an offline discussion with rxin, it's weird to be computing stuff while sorting; we should only order by bound referenc

spark git commit: [SPARK-9127][SQL] Rand/Randn codegen fails with long seed.

2015-07-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master 708794e8a -> 97906944e [SPARK-9127][SQL] Rand/Randn codegen fails with long seed. Author: Reynold Xin Closes #7747 from rxin/SPARK-9127 and squashes the following commits: e851418 [Reynold Xin] [SPARK-9127][SQL] Rand/Randn codegen fa

spark git commit: [SPARK-9430][SQL] Rename IntervalType to CalendarIntervalType.

2015-07-29 Thread rxin
ing IntervalType to CalendarIntervalType so we can do that in the future. Author: Reynold Xin Closes #7745 from rxin/calendarintervaltype and squashes the following commits: 99f64e8 [Reynold Xin] One more line ... 13466c8 [Reynold Xin] Fixed tests. e20f24e [Reynold Xin] [SPARK-9430][SQL] Ren

spark git commit: [SPARK-9411] [SQL] Make Tungsten page sizes configurable

2015-07-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master b715933fc -> 1b0099fc6 [SPARK-9411] [SQL] Make Tungsten page sizes configurable We need to make page sizes configurable so we can reduce them in unit tests and increase them in real production workloads. These sizes are now controlled by

spark git commit: [SPARK-9448][SQL] GenerateUnsafeProjection should not share expressions across instances.

2015-07-29 Thread rxin
und references. Author: Reynold Xin Closes #7759 from rxin/SPARK-9448 and squashes the following commits: c09b50f [Reynold Xin] [SPARK-9448][SQL] GenerateUnsafeProjection should not share expressions across instances. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://

spark git commit: [SPARK-9458] Avoid object allocation in prefix generation.

2015-07-29 Thread rxin
efix. We can use the specialized getters available on InternalRow directly to avoid the object allocation. I also removed the FLOAT prefix, opting for converting float directly to double. Author: Reynold Xin Closes #7763 from rxin/sort-prefix and squashes the following commits: 5dc2f06 [Rey

spark git commit: [SPARK-9460] Avoid byte array allocation in StringPrefixComparator.

2015-07-29 Thread rxin
the longs directly, rather than turning the longs into byte arrays and comparing them byte by byte (unsigned). This only works on little-endian architecture right now. Author: Reynold Xin Closes #7765 from rxin/SPARK-9460 and squashes the following commits: e4908cc [Reynold Xin] Stric
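
The idea in the snippet above, comparing 8-byte prefixes as single unsigned integers instead of byte by byte, can be sketched in Python. This is an illustration under stated assumptions, not the `StringPrefixComparator` code: the function names are hypothetical, and the sketch packs the bytes explicitly in big-endian order so the result does not depend on the machine's byte order (the commit notes its own version only works on little-endian hardware).

```python
import struct

def prefix_as_unsigned_long(data: bytes) -> int:
    """Pack the first 8 bytes (zero-padded on the right) into one unsigned 64-bit int."""
    padded = data[:8].ljust(8, b"\x00")
    # ">Q" is a big-endian unsigned 64-bit integer, so the first byte is the
    # most significant and integer order matches byte-wise lexicographic order.
    return struct.unpack(">Q", padded)[0]

def compare_prefixes(a: bytes, b: bytes) -> int:
    """Return -1, 0, or 1 by comparing the two 8-byte prefixes as integers."""
    pa, pb = prefix_as_unsigned_long(a), prefix_as_unsigned_long(b)
    return (pa > pb) - (pa < pb)
```

Python integers are unbounded, so the comparison is unsigned automatically; in a language with signed 64-bit longs an explicit unsigned comparison would be needed.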

spark git commit: [SPARK-9462][SQL] Initialize nondeterministic expressions in code gen fallback mode.

2015-07-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master 07fd7d364 -> 27850af52 [SPARK-9462][SQL] Initialize nondeterministic expressions in code gen fallback mode. Author: Reynold Xin Closes #7767 from rxin/SPARK-9462 and squashes the following commits: ef3e2d9 [Reynold Xin] Removed prin

spark git commit: Fix reference to self.names in StructType

2015-07-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master 27850af52 -> f5dd11339 Fix reference to self.names in StructType `names` is not defined in this context, I think you meant `self.names`. davies Author: Alex Angelini Closes #7766 from angelini/fix_struct_type_names and squashes the foll
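
The class of bug described above (a bare `names` used where `self.names` was meant) is easy to reproduce in isolation. The class below is a hypothetical stand-in written for this illustration; it is not the PySpark `StructType`, and its method names are invented.

```python
class StructTypeSketch:
    """Hypothetical stand-in for illustration; not pyspark.sql.types.StructType."""

    def __init__(self, names):
        self.names = names

    def broken_index(self, name):
        # Bug: a bare `names` is resolved as a local or global variable,
        # not as an attribute of the instance, so this raises NameError.
        return names.index(name)

    def fixed_index(self, name):
        # Fix: go through `self` to reach the instance attribute.
        return self.names.index(name)
```

Calling `broken_index` fails only at call time, which is why this kind of mistake can survive until the code path is actually exercised.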

spark git commit: HOTFIX: disable HashedRelationSuite.

2015-07-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master e044705b4 -> 712465b68 HOTFIX: disable HashedRelationSuite. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/712465b6 Tree: http://git-wip-us.apache.org/repos/asf/spark/

spark git commit: [SPARK-8005][SQL] Input file name

2015-07-29 Thread rxin
Repository: spark Updated Branches: refs/heads/master e127ec34d -> 1221849f9 [SPARK-8005][SQL] Input file name Users can now get the file name of the partition being read in. A thread local variable is in `SQLNewHadoopRDD` and is set when the partition is computed. `SQLNewHadoopRDD` is moved

spark git commit: Revert "[SPARK-9458] Avoid object allocation in prefix generation."

2015-07-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master 76f2e393a -> 4a8bb9d00 Revert "[SPARK-9458] Avoid object allocation in prefix generation." This reverts commit 9514d874f0cf61f1eb4ec4f5f66e053119f769c9. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.

spark git commit: Fix flaky HashedRelationSuite

2015-07-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4a8bb9d00 -> 5ba2d4406 Fix flaky HashedRelationSuite SparkEnv might not have been set in local unit tests. Author: Reynold Xin Closes #7784 from rxin/HashedRelationSuite and squashes the following commits: 435d64b [Reynold Xin]

spark git commit: [SPARK-9390][SQL] create a wrapper for array type

2015-07-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master 7492a33fd -> c0cc0eaec [SPARK-9390][SQL] create a wrapper for array type Author: Wenchen Fan Closes #7724 from cloud-fan/array-data and squashes the following commits: d0408a1 [Wenchen Fan] fix python 661e608 [Wenchen Fan] rebase f39256c

spark git commit: [SPARK-8850] [SQL] Enable Unsafe mode by default

2015-07-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master ab78b1d2a -> 520ec0ff9 [SPARK-8850] [SQL] Enable Unsafe mode by default This pull request enables Unsafe mode by default in Spark SQL. In order to do this, we had to fix a number of small issues: **List of fixed blockers**: - [x] Make so

spark git commit: [SPARK-9460] Fix prefix generation for UTF8String.

2015-07-30 Thread rxin
Xin Closes #7789 from rxin/utf8string and squashes the following commits: 86ffa3e [Reynold Xin] Mask out data outside of valid range. 4d647ed [Reynold Xin] Mask out data. c6e8794 [Reynold Xin] [SPARK-9460] Fix prefix generation for UTF8String. Project: http://git-wip-us.apache.org/repos/asf/spark/r

spark git commit: [SPARK-7157][SQL] add sampleBy to DataFrame

2015-07-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master ca71cc8c8 -> df3266951 [SPARK-7157][SQL] add sampleBy to DataFrame This was previously committed but then reverted due to test failures (see #6769). Author: Xiangrui Meng Closes #7755 from rxin/SPARK-7157 and squashes the follow

spark git commit: [SPARK-9458][SPARK-9469][SQL] Code generate prefix computation in sorting & moves unsafe conversion out of TungstenSort.

2015-07-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master df3266951 -> e7a0976e9 [SPARK-9458][SPARK-9469][SQL] Code generate prefix computation in sorting & moves unsafe conversion out of TungstenSort. Author: Reynold Xin Closes #7803 from rxin/SPARK-9458 and squashes the following

spark git commit: [SPARK-9425] [SQL] support DecimalType in UnsafeRow

2015-07-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master e7a0976e9 -> 0b1a464b6 [SPARK-9425] [SQL] support DecimalType in UnsafeRow This PR adds support for DecimalType in UnsafeRow: for precision <= 18 it's settable; otherwise it's not. Author: Davies Liu Closes #7758 from dav
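
The snippet does not say why 18 is the cutoff. One plausible reading, stated here as an assumption rather than as the PR's rationale, is that any decimal with at most 18 digits has an unscaled value that fits in a signed 64-bit long, so it can live in a fixed-width slot and be overwritten in place. A small Python check of that arithmetic, with hypothetical helper names:

```python
from decimal import Decimal

LONG_MAX = 2**63 - 1  # largest signed 64-bit value

def fits_in_long(precision: int) -> bool:
    """True if every decimal with this many digits has an unscaled value <= LONG_MAX."""
    return 10**precision - 1 <= LONG_MAX

def unscaled_value(d: Decimal, scale: int) -> int:
    """Return d * 10**scale as an integer, e.g. 123.45 with scale 2 -> 12345."""
    return int(d * (10 ** scale))

print(fits_in_long(18))  # True
print(fits_in_long(19))  # False: 10**19 - 1 exceeds 2**63 - 1
```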

spark git commit: [SPARK-6319][SQL] Throw AnalysisException when using BinaryType on Join and Aggregate

2015-07-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master 0b1a464b6 -> 351eda0e2 [SPARK-6319][SQL] Throw AnalysisException when using BinaryType on Join and Aggregate JIRA: https://issues.apache.org/jira/browse/SPARK-6319 Spark SQL uses plain byte arrays to represent binary values. However, the

spark git commit: [SPARK-8176] [SPARK-8197] [SQL] function to_date/ trunc

2015-07-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9307f5653 -> 83670fc9e [SPARK-8176] [SPARK-8197] [SQL] function to_date/ trunc This PR is based on #6988, thanks to adrian-wang. This brings two SQL functions: to_date() and trunc(). Closes #6988 Author: Daoyuan Wang Author: Davies Li

spark git commit: [SPARK-9152][SQL] Implement code generation for Like and RLike

2015-07-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master 69b62f76f -> 0244170b6 [SPARK-9152][SQL] Implement code generation for Like and RLike JIRA: https://issues.apache.org/jira/browse/SPARK-9152 This PR implements code generation for `Like` and `RLike`. Author: Liang-Chi Hsieh Closes #7561

spark git commit: [SPARK-9496][SQL]do not print the password in config

2015-07-30 Thread rxin
Repository: spark Updated Branches: refs/heads/master 0244170b6 -> a3a85d73d [SPARK-9496][SQL]do not print the password in config https://issues.apache.org/jira/browse/SPARK-9496 We had better not print the password in the log. Author: WangTaoTheTonic Closes #7815 from WangTaoTheTonic/master an

spark git commit: [SPARK-9496][SQL]do not print the password in config

2015-07-30 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.4 6e85064f4 -> 3d6a9214e [SPARK-9496][SQL]do not print the password in config https://issues.apache.org/jira/browse/SPARK-9496 We had better not print the password in the log. Author: WangTaoTheTonic Closes #7815 from WangTaoTheTonic/maste

spark git commit: [SPARK-8271][SQL]string function: soundex

2015-07-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3fc0cb920 -> 4d5a6e7b6 [SPARK-8271][SQL]string function: soundex This PR brings the SQL function soundex(); see https://issues.apache.org/jira/browse/HIVE-9738 It's based on #7115, thanks to HuJiayin Author: HuJiayin Author: Davies Liu

spark git commit: [SPARK-9358][SQL] Code generation for UnsafeRow joiner.

2015-07-31 Thread rxin
is inherently hard to test this low-level stuff, the test suites employ randomized testing heavily in order to guarantee correctness. Author: Reynold Xin Closes #7821 from rxin/rowconcat and squashes the following commits: 8717f35 [Reynold Xin] Rebase and code review. 72c5d8e [Reynold Xin] Fixed a

spark git commit: [SPARK-8264][SQL]add substring_index function

2015-07-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 03377d252 -> 6996bd2e8 [SPARK-8264][SQL]add substring_index function This PR is based on #7533, thanks to zhichao-li. Closes #7533 Author: zhichao.li Author: Davies Liu Closes #7843 from davies/str_index and squashes the following comm

spark git commit: [SPARK-9464][SQL] Property checks for UTF8String

2015-07-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6996bd2e8 -> 14f263448 [SPARK-9464][SQL] Property checks for UTF8String This PR is based on the original work by JoshRosen in #7780, which adds ScalaCheck property-based tests for UTF8String. Author: Josh Rosen Author: Yijie Shen Close

spark git commit: [SPARK-9415][SQL] Throw AnalysisException when using MapType on Join and Aggregate

2015-07-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 14f263448 -> 3320b0ba2 [SPARK-9415][SQL] Throw AnalysisException when using MapType on Join and Aggregate JIRA: https://issues.apache.org/jira/browse/SPARK-9415 Following up #7787. We shouldn't use MapType as grouping keys and join keys t

spark git commit: [SPARK-9517][SQL] BytesToBytesMap should encode data the same way as UnsafeExternalSorter

2015-07-31 Thread rxin
rds directly into UnsafeExternalSorter: ``` 4B key+value length, 4B key length, key data, value data ``` Author: Reynold Xin Closes #7845 from rxin/kvsort-rebase and squashes the following commits: 5716b59 [Reynold Xin] Fixed test. 2e62ccb [Reynold Xin] Updated BytesToBytesMap's data encodin
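
The record layout quoted above (4-byte key+value length, 4-byte key length, key data, value data) can be sketched as a pair of encode/decode helpers. This is a rough Python illustration, not the `BytesToBytesMap` code. Two details are assumptions the snippet does not settle: the integers are written little-endian, and the first length counts only the key and value bytes. The function names are hypothetical.

```python
import struct

def encode_record(key: bytes, value: bytes) -> bytes:
    # Assumption: little-endian 4-byte ints; the total counts key + value bytes only.
    header = struct.pack("<ii", len(key) + len(value), len(key))
    return header + key + value

def decode_record(buf: bytes, offset: int = 0):
    """Return (key, value, offset_of_next_record) for the record at `offset`."""
    total_len, key_len = struct.unpack_from("<ii", buf, offset)
    start = offset + 8  # skip the two 4-byte length fields
    key = buf[start:start + key_len]
    value = buf[start + key_len:start + total_len]
    return key, value, start + total_len
```

Because each record carries its own lengths, records can be laid out back to back and walked sequentially by feeding the returned offset into the next `decode_record` call.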

[2/2] spark git commit: [SPARK-9480][SQL] add MapData and cleanup internal row stuff

2015-08-01 Thread rxin
[SPARK-9480][SQL] add MapData and cleanup internal row stuff This PR adds a `MapData` as the internal representation of map type in Spark SQL, and provides a default implementation with just 2 `ArrayData`. After that, we have specialized getters for all internal types, so I removed the generic getter in

[1/2] spark git commit: [SPARK-9480][SQL] add MapData and cleanup internal row stuff

2015-08-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master d90f2cf7a -> 1d59a4162 http://git-wip-us.apache.org/repos/asf/spark/blob/1d59a416/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala -- diff --git a/sql/core/

spark git commit: [SPARK-9495] prefix of DateType/TimestampType

2015-08-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master 84a6982b3 -> 5d9e33d9a [SPARK-9495] prefix of DateType/TimestampType cc rxin Author: Davies Liu Closes #7856 from davies/sort_improve and squashes the following commits: 5fc81bd [Davies Liu] support DateType/TimestampType Proj

spark git commit: [SPARK-9529] [SQL] improve TungstenSort on DecimalType

2015-08-01 Thread rxin
Repository: spark Updated Branches: refs/heads/master 28d944e86 -> 16b928c54 [SPARK-9529] [SQL] improve TungstenSort on DecimalType Generate prefix for DecimalType, fix the random generator of decimal cc JoshRosen Author: Davies Liu Closes #7857 from davies/sort_decimal and squashes the fo

spark git commit: [SPARK-9208][SQL] Sort DataFrame functions alphabetically.

2015-08-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master 244016a95 -> 8eafa2aeb [SPARK-9208][SQL] Sort DataFrame functions alphabetically. Author: Reynold Xin Closes #7861 from rxin/api-audit and squashes the following commits: 7200256 [Reynold Xin] [SPARK-9208][SQL] Sort DataFrame functi

spark git commit: [SPARK-7937][SQL] Support comparison on StructType

2015-08-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master 2e981b7bf -> 0722f4331 [SPARK-7937][SQL] Support comparison on StructType This brings #6519 up-to-date with master branch. Closes #6519. Author: Liang-Chi Hsieh Author: Liang-Chi Hsieh Author: Reynold Xin Closes #7877 from rxin/s

spark git commit: [SPARK-9543][SQL] Add randomized testing for UnsafeKVExternalSorter.

2015-08-02 Thread rxin
hly every 100 records. Author: Reynold Xin Closes #7873 from rxin/kvsorter-randomized-test and squashes the following commits: a08c251 [Reynold Xin] Resource cleanup. 0488b5c [Reynold Xin] [SPARK-9543][SQL] Add randomized testing for UnsafeKVExternalSorter. Project: http://git-wip-us.apache.

spark git commit: [SPARK-9535][SQL][DOCS] Modify document for codegen.

2015-08-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master 9d03ad910 -> 536d2adc1 [SPARK-9535][SQL][DOCS] Modify document for codegen. #7142 made codegen enabled by default so let's modify the corresponding documents. Closes #7142 Author: KaiXinXiaoLei Author: Kousuke Saruta Closes #7863 from

spark git commit: [SPARK-9546][SQL] Centralize orderable data type checking.

2015-08-02 Thread rxin
in sorting. Author: Reynold Xin Closes #7880 from rxin/SPARK-9546 and squashes the following commits: f9e322d [Reynold Xin] Fixed tests. 0439b43 [Reynold Xin] [SPARK-9546][SQL] Centralize orderable data type checking. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://

spark git commit: [SPARK-9404][SPARK-9542][SQL] unsafe array data and map data

2015-08-02 Thread rxin
Repository: spark Updated Branches: refs/heads/master 687c8c371 -> 608353c8e [SPARK-9404][SPARK-9542][SQL] unsafe array data and map data This PR adds an UnsafeArrayData; currently we encode it this way: the first 4 bytes hold the number of elements, then each following 4 bytes hold the start offset of an element, unle
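
The part of the array encoding that survives in the snippet above (an element count followed by one 4-byte start offset per element) can be sketched as follows. This is only an illustration under stated assumptions: the names `encode_array` and `get_element` are hypothetical, ints are little-endian, offsets are measured from the start of the buffer, and an element's end is inferred from the next offset. The clause cut off in the snippet is not modeled.

```python
import struct


def encode_array(elements):
    """Pack byte-string elements as: 4B count, one 4B start offset each, data.

    Assumptions: little-endian ints; offsets are relative to the start of
    the returned buffer. Both are illustrative choices.
    """
    n = len(elements)
    pos = 4 + 4 * n  # element data begins right after count and offsets
    offsets = []
    for element in elements:
        offsets.append(pos)
        pos += len(element)
    return struct.pack(f"<i{n}i", n, *offsets) + b"".join(elements)


def get_element(buf: bytes, i: int) -> bytes:
    """Return element i by looking up its start offset in the header."""
    (n,) = struct.unpack_from("<i", buf, 0)
    if not 0 <= i < n:
        raise IndexError(i)
    (start,) = struct.unpack_from("<i", buf, 4 + 4 * i)
    if i + 1 < n:
        (end,) = struct.unpack_from("<i", buf, 4 + 4 * (i + 1))
    else:
        end = len(buf)  # the last element runs to the end of the buffer
    return buf[start:end]
```

Storing offsets rather than lengths gives constant-time access to any element without scanning the ones before it.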

spark git commit: [SPARK-9549][SQL] fix bugs in expressions

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 608353c8e -> 98d6d9c7a [SPARK-9549][SQL] fix bugs in expressions JIRA: https://issues.apache.org/jira/browse/SPARK-9549 This PR fixes the following bugs: 1. `UnaryMinus`'s codegen version would fail to compile when the input is `Long.MinVa

[2/2] spark git commit: [SPARK-9240] [SQL] Hybrid aggregate operator using unsafe row

2015-08-03 Thread rxin
[SPARK-9240] [SQL] Hybrid aggregate operator using unsafe row This PR adds a base aggregation iterator `AggregationIterator`, which is used to create `SortBasedAggregationIterator` (for sort-based aggregation) and `UnsafeHybridAggregationIterator` (first it tries hash-based aggregation and fall

[1/2] spark git commit: [SPARK-9240] [SQL] Hybrid aggregate operator using unsafe row

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 98d6d9c7a -> 1ebd41b14 http://git-wip-us.apache.org/repos/asf/spark/blob/1ebd41b1/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/sortBasedIterators.scala -

spark git commit: [SPARK-9551][SQL] add a cheap version of copy for UnsafeRow to reuse a copy buffer

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 95dccc633 -> 137f47865 [SPARK-9551][SQL] add a cheap version of copy for UnsafeRow to reuse a copy buffer Author: Wenchen Fan Closes #7885 from cloud-fan/cheap-copy and squashes the following commits: 0900ca1 [Wenchen Fan] replace == wi

spark git commit: [SPARK-9518] [SQL] cleanup generated UnsafeRowJoiner and fix bug

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 137f47865 -> 191bf2689 [SPARK-9518] [SQL] cleanup generated UnsafeRowJoiner and fix bug Currently, when copying the bitsets, we don't consider that row1 may not sit at the beginning of the byte array. cc rxin Author: Davies Liu

spark git commit: Two minor comments from code review on 191bf2689.

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 191bf2689 -> 8be198c86 Two minor comments from code review on 191bf2689. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8be198c8 Tree: http://git-wip-us.apache.org/rep

Git Push Summary

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 [created] b41a32718

spark git commit: [SQL][minor] Simplify UnsafeRow.calculateBitSetWidthInBytes.

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master dfe7bd168 -> 7a9d09f0b [SQL][minor] Simplify UnsafeRow.calculateBitSetWidthInBytes. Author: Reynold Xin Closes #7897 from rxin/calculateBitSetWidthInBytes and squashes the following commits: 2e73b3a [Reynold Xin] [SQL][minor] Simpl
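
For context, the kind of computation a method named `calculateBitSetWidthInBytes` performs can be sketched as below. The snippet does not show the actual change, so this is an assumption: one bit per field, rounded up to whole 8-byte words. The Python function name is hypothetical.

```python
def calculate_bitset_width_in_bytes(num_fields: int) -> int:
    """Bytes needed for a bitset with one bit per field.

    Assumption: the bitset is stored in 64-bit words, so the bit count is
    rounded up to a multiple of 64 and then converted to bytes.
    """
    return ((num_fields + 63) // 64) * 8
```

Under that assumption, rows with 1 to 64 fields need 8 bytes of bitset, and a 65th field adds a second word.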

spark git commit: [SQL][minor] Simplify UnsafeRow.calculateBitSetWidthInBytes.

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 4de833e9e -> 5452e93f0 [SQL][minor] Simplify UnsafeRow.calculateBitSetWidthInBytes. Author: Reynold Xin Closes #7897 from rxin/calculateBitSetWidthInBytes and squashes the following commits: 2e73b3a [Reynold Xin] [SQL][mi

spark git commit: [SPARK-9554] [SQL] Enables in-memory partition pruning by default

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master 7a9d09f0b -> 703e44bff [SPARK-9554] [SQL] Enables in-memory partition pruning by default Author: Cheng Lian Closes #7895 from liancheng/spark-9554/enable-in-memory-partition-pruning and squashes the following commits: 67c403e [Cheng Lia

spark git commit: [SPARK-9554] [SQL] Enables in-memory partition pruning by default

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 5452e93f0 -> 6d46e9b7c [SPARK-9554] [SQL] Enables in-memory partition pruning by default Author: Cheng Lian Closes #7895 from liancheng/spark-9554/enable-in-memory-partition-pruning and squashes the following commits: 67c403e [Cheng

spark git commit: [SPARK-9558][DOCS]Update docs to follow the increase of memory defaults.

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/master ff9169a00 -> ba1c4e138 [SPARK-9558][DOCS]Update docs to follow the increase of memory defaults. Now the memory defaults of master and slave in Standalone mode and History Server are 1g, not 512m. So let's update docs. Author: Kousuke Sarut

spark git commit: [SPARK-9558][DOCS]Update docs to follow the increase of memory defaults.

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 b3117d312 -> 444058d91 [SPARK-9558][DOCS]Update docs to follow the increase of memory defaults. Now the memory defaults of master and slave in Standalone mode and History Server are 1g, not 512m. So let's update docs. Author: Kousuke S

spark git commit: Revert "[SPARK-9372] [SQL] Filter nulls in join keys"

2015-08-03 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.5 29756ff11 -> db5832708 Revert "[SPARK-9372] [SQL] Filter nulls in join keys" This reverts commit 687c8c37150f4c93f8e57d86bb56321a4891286b. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org
