[GitHub] spark issue #14830: [SPARK-16992][PYSPARK][DOCS] import sort and autopep8 on...

2017-06-26 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/14830 Hi, are you still working on this? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2017-06-26 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/14963 Hi, are you still working on this?

[GitHub] spark issue #17096: [SPARK-15243][ML][SQL][PYTHON] Add missing support for u...

2017-06-26 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/17096 @HyukjinKwon @holdenk Hi, are you still working on this?

[GitHub] spark pull request #16056: [SPARK-18623][SQL] Add `returnNullable` to `Stati...

2017-06-27 Thread ueshin
Github user ueshin closed the pull request at: https://github.com/apache/spark/pull/16056

[GitHub] spark issue #16056: [SPARK-18623][SQL] Add `returnNullable` to `StaticInvoke...

2017-06-27 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/16056 I'd close this for now.

[GitHub] spark pull request #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types ...

2017-06-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18444#discussion_r124465362 --- Diff: python/pyspark/sql/types.py --- @@ -958,12 +968,17 @@ def _infer_type(obj): return MapType(_infer_type(key), _infer_type(value

[GitHub] spark pull request #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types ...

2017-06-28 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18444#discussion_r124464895 --- Diff: python/pyspark/sql/tests.py --- @@ -2259,6 +2261,60 @@ def test_BinaryType_serialization(self): df = self.spark.createDataFrame(data

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-06-28 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 ok to test

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-06-28 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 Btw, seems like `array` module in Python 2 doesn't have `array.typecodes` attribute. ``` Traceback (most recent call last): File "/home/jenkins/workspace/SparkPullReques
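
The traceback in the comment above is cut off, so the exact failure is not visible here. As a minimal sketch of one way to cope with the missing attribute: `array.typecodes` was added in Python 3.3, so code that must also run on Python 2 can fall back to a hand-written list. The names `FALLBACK_TYPECODES` and `supported_typecodes` are hypothetical and are not taken from the pull request.

```python
import array

# Type codes documented for the Python 2 ``array`` module.
# FALLBACK_TYPECODES is a hypothetical name used only for this sketch.
FALLBACK_TYPECODES = "cbBuhHiIlLfd"


def supported_typecodes():
    """Return the type codes known to this interpreter's ``array`` module.

    ``array.typecodes`` exists from Python 3.3 onwards; on older
    interpreters the attribute is missing, so the fallback string is used.
    """
    return getattr(array, "typecodes", FALLBACK_TYPECODES)


codes = supported_typecodes()
```

Whether the patch under review took this route or another one cannot be told from the truncated snippet.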

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-06-28 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 In my local environment, this patch for floatarray looks working well: ```python df = spark.createDataFrame([Row(floatarray=array('f',[1,2,3]), doublearray=array(

[GitHub] spark issue #13873: [SPARK-16167][SQL] RowEncoder should preserve array/map ...

2017-06-28 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/13873 Hmm, I guess we need #16056 to fix nullability of `StaticInvoke`.

[GitHub] spark issue #16056: [SPARK-18623][SQL] Add `returnNullable` to `StaticInvoke...

2017-06-29 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/16056 Yes, I'll reopen, and update if needed.

[GitHub] spark pull request #16056: [SPARK-18623][SQL] Add `returnNullable` to `Stati...

2017-06-29 Thread ueshin
GitHub user ueshin reopened a pull request: https://github.com/apache/spark/pull/16056 [SPARK-18623][SQL] Add `returnNullable` to `StaticInvoke` and modify it to handle properly. ## What changes were proposed in this pull request? Add `returnNullable` to `StaticInvoke` the

[GitHub] spark pull request #17865: [SPARK-20456][Docs] Add examples for functions co...

2017-07-02 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/17865#discussion_r125201828 --- Diff: python/pyspark/sql/functions.py --- @@ -1073,12 +1108,17 @@ def last_day(date): return Column(sc._jvm.functions.last_day(_to_java_column

[GitHub] spark issue #17227: [SPARK-19507][PySpark][SQL] Show field name in _verify_t...

2017-07-02 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/17227 Jenkins, retest this please.

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-02 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 Jenkins, retest this please.

[GitHub] spark pull request #18484: [SPARK-21264][PYTHON] Call cross join path in joi...

2017-07-02 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18484#discussion_r125212648 --- Diff: python/pyspark/sql/tests.py --- @@ -2021,6 +2021,11 @@ def test_toDF_with_schema_string(self): self.assertEqual(df.schema.simpleString

[GitHub] spark issue #17865: [SPARK-20456][Docs] Add examples for functions collectio...

2017-07-03 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/17865 Looks like there is a merge conflict. Can you fix it?

[GitHub] spark issue #17865: [SPARK-20456][Docs] Add examples for functions collectio...

2017-07-03 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/17865 Yes, or you can merge master and fix the conflict.

[GitHub] spark issue #18484: [SPARK-21264][PYTHON] Call cross join path in join witho...

2017-07-03 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18484 Thanks! merging to master.

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-03 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 I guess you can define source code encoding to the top of the file like: ``` # -*- coding: utf-8 -*- ```
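
A small sketch of what the suggested declaration looks like in a complete file. PEP 263 requires the declaration on the first or second line; Python 2 needs it before a source file may contain non-ASCII literals, while Python 3 already defaults to UTF-8. The `GREETING` constant is a made-up example and is not taken from the pull request.

```python
# -*- coding: utf-8 -*-
# The line above is the PEP 263 encoding declaration. It must sit on the
# first or second line of the file to take effect.

# A non-ASCII literal; without the declaration, Python 2 would refuse to
# compile this file. GREETING is a hypothetical example value.
GREETING = u"こんにちは"

# The literal is decoded as five Unicode characters, not as raw bytes.
greeting_length = len(GREETING)
```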

[GitHub] spark pull request #18524: [SPARK-21300][SQL] ExternalMapToCatalyst should n...

2017-07-04 Thread ueshin
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/18524 [SPARK-21300][SQL] ExternalMapToCatalyst should null-check map key prior to converting to internal value. ## What changes were proposed in this pull request? `ExternalMapToCatalyst` should

[GitHub] spark issue #18524: [SPARK-21300][SQL] ExternalMapToCatalyst should null-che...

2017-07-04 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18524 cc @cloud-fan

[GitHub] spark issue #16056: [SPARK-18623][SQL] Add `returnNullable` to `StaticInvoke...

2017-07-04 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/16056 #18524 could fix the test failure.

[GitHub] spark pull request #18529: [SPARK-21304][SQL] remove unnecessary isNull vari...

2017-07-04 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18529#discussion_r125542708 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -588,12 +591,14 @@ case class MapObjects

[GitHub] spark issue #16056: [SPARK-18623][SQL] Add `returnNullable` to `StaticInvoke...

2017-07-04 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/16056 Jenkins, retest this please.

[GitHub] spark pull request #18529: [SPARK-21304][SQL] remove unnecessary isNull vari...

2017-07-04 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18529#discussion_r12516 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/objects/objects.scala --- @@ -588,12 +591,14 @@ case class MapObjects

[GitHub] spark issue #18529: [SPARK-21304][SQL] remove unnecessary isNull variable fo...

2017-07-04 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18529 LGTM

[GitHub] spark issue #17865: [SPARK-20456][Docs] Add examples for functions collectio...

2017-07-05 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/17865 Jenkins, retest this please.

[GitHub] spark issue #13873: [SPARK-16167][SQL] RowEncoder should preserve array/map ...

2017-07-05 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/13873 Jenkins, retest this please.

[GitHub] spark issue #17865: [SPARK-20456][Docs] Add examples for functions collectio...

2017-07-05 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/17865 How about https://github.com/apache/spark/pull/17865/files#r123964695 ?

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-05 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 Hmm, seems like `net.razorvine.pickle.objects.ArrayConstructor` handles `'l'` as `int` instead of `long` in Python 2. https://github.com/irmen/Pyrolite/blob/master/java/src/mai

[GitHub] spark pull request #18553: [SPARK-21327][SQL][PYSPARK] ArrayConstructor shou...

2017-07-06 Thread ueshin
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/18553 [SPARK-21327][SQL][PYSPARK] ArrayConstructor should handle an array of typecode 'l' as long rather than int in Python 2. ## What changes were proposed in this pull request?

[GitHub] spark issue #18553: [SPARK-21327][SQL][PYSPARK] ArrayConstructor should hand...

2017-07-06 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18553 cc @holdenk, @HyukjinKwon, @zasdfgbnm

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-06 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 I believe #18553 will fix the problem.

[GitHub] spark pull request #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types ...

2017-07-06 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18444#discussion_r125855962 --- Diff: python/pyspark/sql/types.py --- @@ -958,12 +1043,17 @@ def _infer_type(obj): return MapType(_infer_type(key), _infer_type

[GitHub] spark issue #18553: [SPARK-21327][SQL][PYSPARK] ArrayConstructor should hand...

2017-07-06 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18553 Jenkins, retest this please.

[GitHub] spark issue #18553: [SPARK-21327][SQL][PYSPARK] ArrayConstructor should hand...

2017-07-06 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18553 Thanks for reviewing! merging to master.

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-06 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 Jenkins, retest this please.

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-06 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 LGTM, pending Jenkins. @HyukjinKwon, @holdenk, Do you have any other concerns?

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 Jenkins, retest this please.

[GitHub] spark issue #17865: [SPARK-20456][Docs] Add examples for functions collectio...

2017-07-07 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/17865 LGTM.

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-09 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 @zasdfgbnm Hmm, I agree that this is a bug of `ArrayConstructor`. I guess we should use `'c' -> 19` for [line 60](https://github.com/apache/spark/blob/master/core/src/main/scala/org

[GitHub] spark pull request #18576: [SPARK-21351][SQL] Update nullability based on ch...

2017-07-09 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18576#discussion_r126342208 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/UpdateNullabilityInAttributeReferencesSuite.scala --- @@ -0,0 +1,53

[GitHub] spark issue #18597: [SPARK-20456][PYTHON][FOLLOWUP] Fix timezone-dependent d...

2017-07-10 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18597 How about surrounding with SQLConf `spark.sql.session.timeZone` as `America/Los_Angeles` ?

[GitHub] spark issue #18597: [SPARK-20456][PYTHON][FOLLOWUP] Fix timezone-dependent d...

2017-07-10 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18597 LGTM.

[GitHub] spark issue #18597: [SPARK-20456][PYTHON][FOLLOWUP] Fix timezone-dependent d...

2017-07-10 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18597 Thanks! merging to master.

[GitHub] spark issue #17150: [SPARK-19810][BUILD][CORE] Remove support for Scala 2.10

2017-07-12 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/17150 I guess we can also remove thread safety tests at the bottom of `org.apache.spark.sql.catalyst.ScalaReflectionSuite` ([ScalaReflectionSuite.scala#L341-L376](https://github.com/srowen/spark/blob

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-13 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 @HyukjinKwon, @holdenk, Could you double-check the way to handle an array of typecode `'c'` please? I'm not exactly sure this can cover all cases.

[GitHub] spark pull request #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConvert...

2017-07-17 Thread ueshin
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/18655 [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and add DecimalType, ArrayType and StructType support. ## What changes were proposed in this pull request? This is a refactoring of

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-17 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18655 cc @cloud-fan @BryanCutler

[GitHub] spark pull request #18664: [SPARK-21375][PYSPARK][SQL][WIP] Add Date and Tim...

2017-07-17 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18664#discussion_r127879502 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/arrow/ArrowConvertersSuite.scala --- @@ -792,6 +793,76 @@ class ArrowConvertersSuite

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-17 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18655 Thank you for your comments. I agree that we should split this into smaller PRs. I'll push another commit to remove `ArrowColumnVector` from this as soon as possible.

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-18 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18655 Jenkins, retest this please.

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-18 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18655 @BryanCutler I'd like to share the motivation of refactoring `ArrowConverters` and `ColumnWriter`. For `ColumnWriter`, at first I'd like to support complex types like `Arra

[GitHub] spark pull request #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types ...

2017-07-18 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18444#discussion_r128142573 --- Diff: python/pyspark/sql/types.py --- @@ -938,12 +1016,17 @@ def _infer_type(obj): return MapType(_infer_type(key), _infer_type

[GitHub] spark pull request #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConvert...

2017-07-18 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18655#discussion_r128146856 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/vectorized/ArrowWriter.scala --- @@ -0,0 +1,405 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConvert...

2017-07-18 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18655#discussion_r128146843 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala --- @@ -55,145 +51,55 @@ private[sql] class ArrowPayload

[GitHub] spark pull request #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConvert...

2017-07-18 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18655#discussion_r128146875 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/arrow/ArrowConvertersSuite.scala --- @@ -391,6 +392,85 @@ class ArrowConvertersSuite

[GitHub] spark pull request #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConvert...

2017-07-18 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18655#discussion_r128146851 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala --- @@ -55,145 +51,55 @@ private[sql] class ArrowPayload

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-18 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18655 I see, I'll move files back to `arrow` package.

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-18 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 LGTM, too. pending Jenkins.

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-18 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 Yea, it's related to typecode `'L'` in pypy environment. We might need to handle it as the same as we did for typecode `'c'` or override `constructLongArrayFromUInt64()`.

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-18 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 Btw, now I'm wondering Python 3 can handle array like `array('L', [9223372036854775807])`.
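
The question above can be checked on the Python side alone with a short sketch. Typecode `'L'` maps to C `unsigned long`, whose width is platform-dependent, so whether `9223372036854775807` (2**63 - 1) fits depends on `itemsize`. This says nothing about how the value is handled on the JVM side after unpickling. The helper name `try_unsigned_long_array` is hypothetical.

```python
from array import array

# Largest signed 64-bit value, the one quoted in the comment above.
MAX_SIGNED_64 = 9223372036854775807  # 2**63 - 1


def try_unsigned_long_array(value):
    """Build ``array('L', [value])`` or return None if the value overflows.

    'L' is C ``unsigned long``: 8 bytes on typical 64-bit Linux and macOS
    builds, 4 bytes on Windows, so the outcome depends on the platform.
    """
    try:
        return array("L", [value])
    except OverflowError:
        return None


# Width in bytes of one 'L' element on this interpreter.
item_bytes = array("L").itemsize
arr = try_unsigned_long_array(MAX_SIGNED_64)
```

On builds where `item_bytes` is 8 the array is created and holds the value unchanged; on builds where it is 4 the constructor raises `OverflowError` and the helper returns `None`.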

[GitHub] spark issue #18655: [SPARK-21440][SQL][PYSPARK] Refactor ArrowConverters and...

2017-07-19 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18655 Jenkins, retest this please.

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-19 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 @HyukjinKwon Oh, I was missing his test for `'L'`. Thanks for testing!

[GitHub] spark pull request #18680: [SPARK-21472][SQL] Introduce ArrowColumnVector as...

2017-07-19 Thread ueshin
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/18680 [SPARK-21472][SQL] Introduce ArrowColumnVector as a reader for Arrow vectors. ## What changes were proposed in this pull request? Introducing `ArrowColumnVector` as a reader for Arrow

[GitHub] spark issue #18680: [SPARK-21472][SQL] Introduce ArrowColumnVector as a read...

2017-07-19 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18680 cc @BryanCutler @kiszk @cloud-fan

[GitHub] spark pull request #18680: [SPARK-21472][SQL] Introduce ArrowColumnVector as...

2017-07-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18680#discussion_r128217817 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ArrowColumnVector.java --- @@ -0,0 +1,510 @@ +/* + * Licensed to the

[GitHub] spark issue #18444: [SPARK-16542][SQL][PYSPARK] Fix bugs about types that re...

2017-07-19 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18444 Thanks! merging to master.

[GitHub] spark pull request #18680: [SPARK-21472][SQL] Introduce ArrowColumnVector as...

2017-07-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18680#discussion_r128422124 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ReadOnlyColumnVector.java --- @@ -0,0 +1,250 @@ +/* + * Licensed to the

[GitHub] spark pull request #18680: [SPARK-21472][SQL] Introduce ArrowColumnVector as...

2017-07-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18680#discussion_r128425637 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ReadOnlyColumnVector.java --- @@ -0,0 +1,250 @@ +/* + * Licensed to the

[GitHub] spark pull request #18680: [SPARK-21472][SQL] Introduce ArrowColumnVector as...

2017-07-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18680#discussion_r128425617 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/vectorized/ArrowColumnVector.java --- @@ -0,0 +1,545 @@ +/* + * Licensed to the

[GitHub] spark pull request #18680: [SPARK-21472][SQL] Introduce ArrowColumnVector as...

2017-07-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/18680#discussion_r128425605 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowUtils.scala --- @@ -0,0 +1,109 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #18680: [SPARK-21472][SQL] Introduce ArrowColumnVector as a read...

2017-07-19 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/18680 @BryanCutler Thank you for reviewing! As for scope, yes, I'd like these APIs to be public. Do you have any concerns about it?

[GitHub] spark issue #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior of time...

2017-11-13 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19607 Jenkins, retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior of time...

2017-11-13 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19607 The remaining discussions in this pr are: - [ ] Do we need the config `"spark.sql.execution.pandas.respectSessionTimeZone"`? - @cloud-fan raised that we don't need the

[GitHub] spark issue #19726: [SPARK-22490][DOC] Add PySpark doc for SparkSession.buil...

2017-11-13 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19726 LGTM for now, but I'd like you to add some comments to explain this is a workaround for Sphinx <`1.6.6`.

[GitHub] spark issue #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior of time...

2017-11-14 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19607 I'll ask dev list if we can drop support old Pandas (<0.19.2), or what version we should support if we can't.

[GitHub] spark pull request #19630: wip: [SPARK-22409] Introduce function type argume...

2017-11-14 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19630#discussion_r151042155 --- Diff: python/pyspark/sql/functions.py --- @@ -2247,16 +2142,20 @@ def pandas_udf(f=None, returnType=StringType()): | 8| JOHN DOE

[GitHub] spark issue #19762: [SPARK-22535][PySpark] Sleep before killing the python w...

2017-11-15 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19762 Thanks! merging to master.

[GitHub] spark pull request #19630: [SPARK-22409] Introduce function type argument in...

2017-11-15 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19630#discussion_r151331041 --- Diff: python/pyspark/worker.py --- @@ -89,6 +90,26 @@ def verify_result_length(*a): return lambda *a: (verify_result_length(*a

[GitHub] spark pull request #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "...

2017-11-15 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/17436#discussion_r151333181 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/VectorizedHashMapGenerator.scala --- @@ -75,9 +77,14 @@ class

[GitHub] spark pull request #19630: [SPARK-22409] Introduce function type argument in...

2017-11-16 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19630#discussion_r151466298 --- Diff: python/pyspark/worker.py --- @@ -89,6 +90,26 @@ def verify_result_length(*a): return lambda *a: (verify_result_length(*a

[GitHub] spark pull request #19769: [SPARK-12297][SQL] Adjust timezone for int96 data...

2017-11-16 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19769#discussion_r151597861 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetReadSupport.scala --- @@ -30,6 +30,7 @@ import

[GitHub] spark pull request #19769: [SPARK-12297][SQL] Adjust timezone for int96 data...

2017-11-16 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19769#discussion_r151597632 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala --- @@ -337,6 +341,8 @@ class

[GitHub] spark pull request #19769: [SPARK-12297][SQL] Adjust timezone for int96 data...

2017-11-16 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19769#discussion_r151597335 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedParquetRecordReader.java --- @@ -105,13 +108,23

[GitHub] spark pull request #19769: [SPARK-12297][SQL] Adjust timezone for int96 data...

2017-11-16 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19769#discussion_r151596892 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java --- @@ -93,13 +94,18 @@ private

[GitHub] spark pull request #19769: [SPARK-12297][SQL] Adjust timezone for int96 data...

2017-11-16 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19769#discussion_r151598171 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetInteroperabilitySuite.scala --- @@ -87,4 +96,113 @@ class

[GitHub] spark issue #19782: [SPARK-22554][PYTHON] Add a config to control if PySpark...

2017-11-19 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19782 LGTM.

[GitHub] spark pull request #19769: [SPARK-12297][SQL] Adjust timezone for int96 data...

2017-11-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19769#discussion_r151908974 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetInteroperabilitySuite.scala --- @@ -87,4 +95,109 @@ class

[GitHub] spark pull request #19769: [SPARK-12297][SQL] Adjust timezone for int96 data...

2017-11-19 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19769#discussion_r151909614 --- Diff: sql/core/src/main/java/org/apache/spark/sql/execution/datasources/parquet/VectorizedColumnReader.java --- @@ -298,7 +304,10 @@ private void

[GitHub] spark issue #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior of time...

2017-11-20 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19607 Jenkins, retest this please.

[GitHub] spark pull request #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior ...

2017-11-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19607#discussion_r152174813 --- Diff: python/pyspark/sql/dataframe.py --- @@ -1913,7 +1920,16 @@ def toPandas(self): for f, t in dtype.items

[GitHub] spark pull request #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior ...

2017-11-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19607#discussion_r152174935 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ArrowPythonRunner.scala --- @@ -58,6 +59,11 @@ class ArrowPythonRunner

[GitHub] spark pull request #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior ...

2017-11-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19607#discussion_r152174929 --- Diff: python/pyspark/worker.py --- @@ -150,7 +150,8 @@ def read_udfs(pickleSer, infile, eval_type): if eval_type

[GitHub] spark issue #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior of time...

2017-11-20 Thread ueshin
Github user ueshin commented on the issue: https://github.com/apache/spark/pull/19607 Well, I'd leave the config `spark.sql.execution.pandas.respectSessionTimeZone` as it is for now to be safe as @gatorsmile mentioned before. And as we discussed in dev list, I'll upda

[GitHub] spark pull request #17436: [SPARK-20101][SQL] Use OffHeapColumnVector when "...

2017-11-20 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/17436#discussion_r152189670 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -140,6 +140,13 @@ object SQLConf { .booleanConf

[GitHub] spark pull request #19607: [WIP][SPARK-22395][SQL][PYTHON] Fix the behavior ...

2017-11-21 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19607#discussion_r152201452 --- Diff: python/pyspark/sql/types.py --- @@ -1678,37 +1678,105 @@ def from_arrow_schema(arrow_schema): for field in arrow_schema

[GitHub] spark pull request #19468: [SPARK-18278] [Scheduler] Spark on Kubernetes - B...

2017-11-21 Thread ueshin
Github user ueshin commented on a diff in the pull request: https://github.com/apache/spark/pull/19468#discussion_r152217994 --- Diff: resource-managers/kubernetes/core/src/test/scala/org/apache/spark/scheduler/cluster/k8s/KubernetesClusterSchedulerBackendSuite.scala --- @@ -0,0
