git commit: [SPARK-4107] Fix incorrect handling of read() and skip() return values

2014-10-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4ceb048b3 -> 46c63417c [SPARK-4107] Fix incorrect handling of read() and skip() return values `read()` may return fewer bytes than requested; when this occurred, the old code would silently return less data than requested, which might
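
The contract behind the fix is easy to get wrong: `InputStream.read(buf, off, len)` may legally return fewer than `len` bytes even when the stream is not exhausted. A minimal sketch of the defensive loop (hypothetical helper, not the patch's actual code):

```scala
import java.io.{EOFException, InputStream}

// Keep reading until exactly `len` bytes have arrived; a single read()
// call may return fewer bytes without signaling end-of-stream.
def readFully(in: InputStream, buf: Array[Byte], off: Int, len: Int): Unit = {
  var pos = 0
  while (pos < len) {
    val n = in.read(buf, off + pos, len - pos)
    if (n == -1) throw new EOFException(s"stream ended after $pos of $len bytes")
    pos += n
  }
}
```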

git commit: [SPARK-4116][YARN]Delete the abandoned log4j-spark-container.properties

2014-10-28 Thread tgraves
Repository: spark Updated Branches: refs/heads/master fae095bc7 -> 47346cd02 [SPARK-4116][YARN]Delete the abandoned log4j-spark-container.properties Since it was renamed in https://github.com/apache/spark/pull/560, log4j-spark-container.properties has never been used again. And I have

git commit: [SPARK-4098][YARN]use appUIAddress instead of appUIHostPort in yarn-client mode

2014-10-28 Thread tgraves
Repository: spark Updated Branches: refs/heads/master e8813be65 -> 0ac52e305 [SPARK-4098][YARN]use appUIAddress instead of appUIHostPort in yarn-client mode https://issues.apache.org/jira/browse/SPARK-4098 Author: WangTaoTheTonic barneystin...@aliyun.com Closes #2958 from

git commit: [SPARK-4110] Wrong comments about default settings in spark-daemon.sh

2014-10-28 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 7768a800d -> 44d8b45a3 [SPARK-4110] Wrong comments about default settings in spark-daemon.sh In spark-daemon.sh, there are the following comments. # SPARK_CONF_DIR Alternate conf dir. Default is ${SPARK_PREFIX}/conf. #

git commit: [SPARK-4110] Wrong comments about default settings in spark-daemon.sh

2014-10-28 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.1 2ef2f5a7c -> dee331738 [SPARK-4110] Wrong comments about default settings in spark-daemon.sh In spark-daemon.sh, there are the following comments. # SPARK_CONF_DIR Alternate conf dir. Default is ${SPARK_PREFIX}/conf. #

git commit: [SPARK-4107] Fix incorrect handling of read() and skip() return values (branch-1.1 backport)

2014-10-28 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.1 dee331738 -> 286f1efb0 [SPARK-4107] Fix incorrect handling of read() and skip() return values (branch-1.1 backport) `read()` may return fewer bytes than requested; when this occurred, the old code would silently return less data than

git commit: [SPARK-4096][YARN]let ApplicationMaster accept executor memory argument in same format as JVM memory strings

2014-10-28 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 44d8b45a3 -> 1ea3e3dc9 [SPARK-4096][YARN]let ApplicationMaster accept executor memory argument in same format as JVM memory strings Here `ApplicationMaster` accepts the executor memory argument only in plain number format; we should let it accept
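
A sketch of the kind of parsing this implies, shaped like Spark's `Utils.memoryStringToMb` (simplified here, not the actual patch):

```scala
// Accept JVM-style memory strings ("512m", "2g", ...) and normalize
// them to megabytes; a plain number is treated as bytes.
def memoryStringToMb(str: String): Int = {
  val lower = str.trim.toLowerCase
  if (lower.endsWith("k")) (lower.dropRight(1).toLong / 1024).toInt
  else if (lower.endsWith("m")) lower.dropRight(1).toInt
  else if (lower.endsWith("g")) lower.dropRight(1).toInt * 1024
  else (lower.toLong / 1024 / 1024).toInt
}
```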

git commit: [SPARK-4065] Add check for IPython on Windows

2014-10-28 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.1 286f1efb0 -> f0c571760 [SPARK-4065] Add check for IPython on Windows This issue employs logic similar to the bash launcher (pyspark) to check if IPYTHON=1, and if so launch ipython with options in IPYTHON_OPTS. This fix assumes that

git commit: [SPARK-3814][SQL] Support for Bitwise AND(&), OR(|), XOR(^), NOT(~) in Spark HQL and SQL

2014-10-28 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 6c1b981c3 -> 5807cb40a [SPARK-3814][SQL] Support for Bitwise AND(&), OR(|), XOR(^), NOT(~) in Spark HQL and SQL Currently there is no support for Bitwise &, | in Spark HiveQL or Spark SQL. This PR adds that support. I am closing
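
A usage sketch once the operators are wired in (hypothetical table and data, 1.x-era API):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val sc = new SparkContext(new SparkConf().setAppName("bitwise").setMaster("local"))
val sqlContext = new SQLContext(sc)
import sqlContext._  // brings in the implicit RDD-to-SchemaRDD conversion

// Exercise &, |, ^ and ~ straight from SQL.
case class Event(flags: Int, mask: Int)
sc.parallelize(Seq(Event(5, 3))).registerTempTable("events")
sqlContext.sql("SELECT flags & mask, flags | mask, flags ^ mask, ~flags FROM events")
  .collect().foreach(println)
```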

git commit: [SPARK-3988][SQL] add public API for date type

2014-10-28 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 5807cb40a -> 47a40f60d [SPARK-3988][SQL] add public API for date type Add JSON and Python APIs for the date type. By using Pickle, `java.sql.Date` is serialized as a calendar and recognized in Python as `datetime.datetime`. Author: Daoyuan
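
A minimal sketch from the Scala side (1.x-era paths, before these types moved to `org.apache.spark.sql.types`; assumes an existing `sc` and `sqlContext`):

```scala
import java.sql.Date
import org.apache.spark.sql._

// Build a schema using the now-public DateType and apply it to rows.
val schema = StructType(Seq(
  StructField("item", StringType, nullable = false),
  StructField("day", DateType, nullable = false)))
val rows = sc.parallelize(Seq(Row("book", Date.valueOf("2014-10-28"))))
sqlContext.applySchema(rows, schema).registerTempTable("purchases")
```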

git commit: [SPARK-3922] Refactor spark-core to use Utils.UTF_8

2014-10-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 47a40f60d -> abcafcfba [SPARK-3922] Refactor spark-core to use Utils.UTF_8 A global UTF-8 constant is very helpful for handling encoding problems when converting between String and bytes. There are several solutions here: 1. Add `val UTF_8 =
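
The idea in miniature (a sketch of the pattern, not the patch itself):

```scala
import java.nio.charset.Charset

// One shared constant instead of Charset.forName("UTF-8") calls and
// bare "UTF-8" string literals scattered across the codebase.
object Utils {
  val UTF_8: Charset = Charset.forName("UTF-8")
}

val bytes = "payload".getBytes(Utils.UTF_8)
val text  = new String(bytes, Utils.UTF_8)
```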

git commit: [SPARK-3343] [SQL] Add serde support for CTAS

2014-10-28 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master abcafcfba -> 4b55482ab [SPARK-3343] [SQL] Add serde support for CTAS Currently, `CTAS` (Create Table As Select) doesn't support specifying the `SerDe` in HQL. This PR will pass down the `ASTNode` into the physical operator
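
The shape of a statement this unlocks (hypothetical table names and SerDe class, assuming a `HiveContext` named `hiveContext`):

```scala
// CTAS with an explicit SerDe, now passed through to the physical operator.
hiveContext.sql("""
  CREATE TABLE events_json
  ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
  STORED AS TEXTFILE
  AS SELECT id, name FROM events
""")
```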

[1/2] [SPARK-4084] Reuse sort key in Sorter

2014-10-28 Thread adav
Repository: spark Updated Branches: refs/heads/master 4b55482ab -> 84e5da87e http://git-wip-us.apache.org/repos/asf/spark/blob/84e5da87/core/src/test/scala/org/apache/spark/util/collection/SorterSuite.scala -- diff --git

[2/2] git commit: [SPARK-4084] Reuse sort key in Sorter

2014-10-28 Thread adav
[SPARK-4084] Reuse sort key in Sorter Sorter uses a generic-typed key for sorting. When the data is large, it creates lots of key objects, which is not efficient. We should reuse the key in Sorter for memory efficiency. This change is part of the petabyte sort implementation from rxin. The
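
The reuse pattern, sketched generically (a hypothetical trait mirroring the shape of the `SortDataFormat` change):

```scala
// The caller allocates one mutable key up front; the accessor fills it
// in and returns it, rather than allocating a fresh key per comparison.
trait KeyReuse[K, Buffer] {
  def newKey(): K                                  // one-time allocation
  def getKey(data: Buffer, pos: Int, reuse: K): K  // fill `reuse`, return it
}
```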

git commit: [SPARK-4008] Fix kryo with fold in KryoSerializerSuite

2014-10-28 Thread adav
Repository: spark Updated Branches: refs/heads/master 84e5da87e -> 1536d7033 [SPARK-4008] Fix kryo with fold in KryoSerializerSuite `zeroValue` will be serialized by `spark.closure.serializer` but `spark.closure.serializer` only supports the default Java serializer. So it must not be
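
The constraint in miniature (hypothetical class, assumes an existing `sc`):

```scala
// fold's zeroValue travels through spark.closure.serializer (plain Java
// serialization) even when the data serializer is Kryo, so the zero
// value itself must be java.io.Serializable (case classes are).
case class Counter(n: Int)
val total = sc.parallelize(Seq(Counter(1), Counter(2)))
  .fold(Counter(0))((a, b) => Counter(a.n + b.n))
```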

git commit: [SPARK-3904] [SQL] add constant objectinspector support for udfs

2014-10-28 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 1536d7033 -> b5e79bf88 [SPARK-3904] [SQL] add constant objectinspector support for udfs In HQL, we convert all of the data types into normal `ObjectInspector`s for UDFs; in most cases this works, however some UDFs actually require their

git commit: [SPARK-4133] [SQL] [PySpark] type conversion for python udf

2014-10-28 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master b5e79bf88 -> 8c0bfd08f [SPARK-4133] [SQL] [PySpark] type conversion for python udf When calling a Python UDF on ArrayType/MapType/PrimitiveType, the returnType can also be ArrayType/MapType/PrimitiveType. For StructType, it will act as tuple