[GitHub] spark issue #10527: [SPARK-12567][SQL] Add aes_{encrypt,decrypt} UDFs

2018-02-12 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/10527 @vectorijk Is this PR dead?

[GitHub] spark issue #19234: [SPARK-22010][PySpark] Change fromInternal method of Tim...

2017-11-07 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19234 I think that porting the changes from Python 3.6 gives us code that is too complicated. I'm closing it.

[GitHub] spark pull request #19234: [SPARK-22010][PySpark] Change fromInternal method...

2017-11-06 Thread maver1ck
Github user maver1ck closed the pull request at: https://github.com/apache/spark/pull/19234

[GitHub] spark issue #19255: [SPARK-22029][PySpark] Add lru_cache to _parse_datatype_...

2017-11-06 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19255 OK. Let's close it.

[GitHub] spark pull request #19255: [SPARK-22029][PySpark] Add lru_cache to _parse_da...

2017-11-06 Thread maver1ck
Github user maver1ck closed the pull request at: https://github.com/apache/spark/pull/19255

[GitHub] spark issue #19255: [SPARK-22029][PySpark] Add lru_cache to _parse_datatype_...

2017-10-27 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19255 Ping @HyukjinKwon

[GitHub] spark issue #19566: [SPARK-22341][yarn] Impersonate correct user when prepar...

2017-10-25 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19566 @vanzin I tested your patch. It worked.

[GitHub] spark pull request #19255: [SPARK-22029][PySpark] Add lru_cache to _parse_da...

2017-10-23 Thread maver1ck
Github user maver1ck commented on a diff in the pull request: https://github.com/apache/spark/pull/19255#discussion_r146248695 --- Diff: python/pyspark/sql/types.py --- @@ -24,6 +24,7 @@ import re import base64 from array import array +from functools import

[GitHub] spark pull request #18685: [SPARK-21439][PySpark] Support for ABCMeta in PyS...

2017-10-23 Thread maver1ck
Github user maver1ck closed the pull request at: https://github.com/apache/spark/pull/18685

[GitHub] spark issue #19255: [SPARK-22029][PySpark] Add lru_cache to _parse_datatype_...

2017-10-23 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19255 Jenkins, retest this please.

[GitHub] spark issue #18685: [SPARK-21439][PySpark] Support for ABCMeta in PySpark

2017-10-23 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/18685 I realized that the changes here were also added in SPARK-21070: https://github.com/apache/spark/commit/751f513367ae776c6d6815e1ce138078924872eb So we can close this PR.

[GitHub] spark issue #19234: [SPARK-22010][PySpark] Change fromInternal method of Tim...

2017-09-21 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19234 It was introduced with this PEP. https://www.python.org/dev/peps/pep-0495/

[GitHub] spark issue #19255: [WIP][SPARK-22029][PySpark] Add lru_cache to _parse_data...

2017-09-20 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19255 @HyukjinKwon I added perf tests.

[GitHub] spark issue #19246: [SPARK-22025][PySpark] Speeding up fromInternal for Stru...

2017-09-20 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19246 @HyukjinKwon I created this before https://github.com/apache/spark/pull/19249, which greatly decreases the number of function calls. I agree we can close it.

[GitHub] spark pull request #19246: [SPARK-22025][PySpark] Speeding up fromInternal f...

2017-09-20 Thread maver1ck
Github user maver1ck closed the pull request at: https://github.com/apache/spark/pull/19246

[GitHub] spark issue #19234: [SPARK-22010][PySpark] Change fromInternal method of Tim...

2017-09-18 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19234 OK. It passed all tests, so let's merge it.

[GitHub] spark issue #19234: [WIP][SPARK-22010][PySpark] Change fromInternal method o...

2017-09-18 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19234 I checked with some samples, and the code using float can trigger errors.

[GitHub] spark issue #19249: [SPARK-22032][PySpark] Speed up StructType conversion

2017-09-18 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19249 @ueshin I think that for MapType this is not a solution, because every key/value of a MapType has the same type, so we need conversion either for all entries or for none.
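
The point about MapType can be made concrete with a small sketch (hypothetical class, not the actual pyspark.sql.types code): the key and value types are fixed for the whole map, so a single needConversion() check decides whether every entry is converted or none is.
```
# Illustrative sketch of why MapType conversion is all-or-nothing: the key and
# value types are the same for every entry, so one check covers the whole map.
class MapTypeSketch(object):
    def __init__(self, keyType, valueType):
        self.keyType = keyType
        self.valueType = valueType

    def needConversion(self):
        return self.keyType.needConversion() or self.valueType.needConversion()

    def fromInternal(self, obj):
        if obj is None or not self.needConversion():
            return obj
        # Every entry goes through the converters -- there is no per-entry choice.
        return dict((self.keyType.fromInternal(k), self.valueType.fromInternal(v))
                    for k, v in obj.items())
```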

[GitHub] spark issue #19234: [SPARK-22010][PySpark] Change fromInternal method of Tim...

2017-09-18 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19234 I'm asking because such code is 2x faster than my solution.

[GitHub] spark issue #19234: [SPARK-22010][PySpark] Change fromInternal method of Tim...

2017-09-18 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19234 Any idea why we're not using `datetime.datetime.fromtimestamp(ts / 10.)`? There is a comment about overflow. But if it exists

[GitHub] spark issue #19260: [SPARK-22043][PYTHON] Improves error message for show_pr...

2017-09-17 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19260 LGTM

[GitHub] spark issue #19249: [SPARK-22032][PySpark] Speed up StructType conversion

2017-09-17 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19249 Done.

[GitHub] spark pull request #19249: [SPARK-22032][PySpark] Speed up StructType.fromIn...

2017-09-17 Thread maver1ck
Github user maver1ck commented on a diff in the pull request: https://github.com/apache/spark/pull/19249#discussion_r139306166 --- Diff: python/pyspark/sql/types.py --- @@ -619,7 +621,8 @@ def fromInternal(self, obj): # it's already converted by pickler

[GitHub] spark pull request #19255: [WIP][SPARK-22029][PySpark] Add lru_cache to _par...

2017-09-17 Thread maver1ck
Github user maver1ck commented on a diff in the pull request: https://github.com/apache/spark/pull/19255#discussion_r139305244 --- Diff: python/pyspark/sql/types.py --- @@ -24,6 +24,7 @@ import re import base64 from array import array +from functools import

[GitHub] spark pull request #19249: [SPARK-22032][PySpark] Speed up StructType.fromIn...

2017-09-17 Thread maver1ck
Github user maver1ck commented on a diff in the pull request: https://github.com/apache/spark/pull/19249#discussion_r139303509 --- Diff: python/pyspark/sql/types.py --- @@ -619,7 +621,8 @@ def fromInternal(self, obj): # it's already converted by pickler

[GitHub] spark pull request #19255: [WIP][SPARK-22029] Add lru_cache to _parse_dataty...

2017-09-16 Thread maver1ck
Github user maver1ck commented on a diff in the pull request: https://github.com/apache/spark/pull/19255#discussion_r139292791 --- Diff: python/pyspark/sql/types.py --- @@ -24,6 +24,7 @@ import re import base64 from array import array +from functools import

[GitHub] spark pull request #19255: [WIP][SPARK-22029] Add lru_cache to _parse_dataty...

2017-09-16 Thread maver1ck
GitHub user maver1ck opened a pull request: https://github.com/apache/spark/pull/19255 [WIP][SPARK-22029] Add lru_cache to _parse_datatype_json_string ## What changes were proposed in this pull request? _parse_datatype_json_string is called many times for the same datatypes
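
The idea behind the PR can be shown with a small sketch (illustrative only; `json.loads` stands in for the real DataType parsing, and the function name here is made up):
```
import json
from functools import lru_cache

# Sketch of the proposal, not the Spark source: memoize parsing on the schema
# JSON string, which is hashable, so repeated lookups of the same schema skip
# the parse entirely.
@lru_cache(maxsize=128)
def parse_datatype_json_string_cached(json_string):
    return json.loads(json_string)


if __name__ == "__main__":
    schema = '{"type": "struct", "fields": []}'
    for _ in range(1000):
        parse_datatype_json_string_cached(schema)
    # hits=999, misses=1 -- the same schema string is parsed only once.
    print(parse_datatype_json_string_cached.cache_info())
```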

[GitHub] spark pull request #19246: [SPARK-22025] Speeding up fromInternal for Struct...

2017-09-16 Thread maver1ck
Github user maver1ck commented on a diff in the pull request: https://github.com/apache/spark/pull/19246#discussion_r139291502 --- Diff: python/pyspark/sql/types.py --- @@ -410,6 +410,24 @@ def __init__(self, name, dataType, nullable=True, metadata=None

[GitHub] spark issue #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-16 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19249 I added a benchmark for this code.

[GitHub] spark pull request #19234: [SPARK-22010] Change fromInternal method of Times...

2017-09-16 Thread maver1ck
Github user maver1ck commented on a diff in the pull request: https://github.com/apache/spark/pull/19234#discussion_r139290042 --- Diff: python/pyspark/sql/types.py --- @@ -196,7 +199,9 @@ def toInternal(self, dt): def fromInternal(self, ts): if ts is not None

[GitHub] spark issue #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-16 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19249 Yep.

[GitHub] spark issue #18685: [SPARK-21439] Support for ABCMeta in PySpark

2017-09-16 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/18685 Ping received. I'll try to add tests and resolve the conflict.

[GitHub] spark issue #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-16 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19249 I checked this with my production code. It gives me about a 6-7% speedup and removes 408 million function calls :) I'll try to create a benchmark.

[GitHub] spark issue #19246: [SPARK-22025] Speeding up fromInternal for StructField

2017-09-16 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/19246 @dongjoon-hyun I'll do it on Monday.

[GitHub] spark pull request #19234: [SPARK-22010] Change fromInternal method of Times...

2017-09-16 Thread maver1ck
Github user maver1ck commented on a diff in the pull request: https://github.com/apache/spark/pull/19234#discussion_r139284824 --- Diff: python/pyspark/sql/types.py --- @@ -196,7 +199,9 @@ def toInternal(self, dt): def fromInternal(self, ts): if ts is not None

[GitHub] spark pull request #19249: [SPARK-22032] Speed up StructType.fromInternal

2017-09-15 Thread maver1ck
GitHub user maver1ck opened a pull request: https://github.com/apache/spark/pull/19249 [SPARK-22032] Speed up StructType.fromInternal ## What changes were proposed in this pull request? StructType.fromInternal is calling f.fromInternal(v) for every field. We can use
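
The description is truncated above, but the general shape of such an optimization -- precomputing per-field conversion flags once instead of calling f.fromInternal(v) unconditionally -- can be sketched as follows (illustrative class, not the actual pyspark.sql.types code):
```
# Illustrative sketch: decide once, per schema, which fields actually need
# conversion, so the per-row fromInternal loop skips no-op converter calls.
class StructTypeSketch(object):
    def __init__(self, fields):
        self.fields = fields
        self._need_conversion = [f.needConversion() for f in fields]
        self._need_any = any(self._need_conversion)

    def fromInternal(self, obj):
        if obj is None or not self._need_any:
            return obj
        # Call fromInternal only where the flag says it does real work.
        return tuple(f.fromInternal(v) if needed else v
                     for f, v, needed in zip(self.fields, obj, self._need_conversion))
```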

[GitHub] spark pull request #19246: [SPARK-22025] Speeding up fromInternal for Struct...

2017-09-15 Thread maver1ck
GitHub user maver1ck opened a pull request: https://github.com/apache/spark/pull/19246 [SPARK-22025] Speeding up fromInternal for StructField ## What changes were proposed in this pull request? Changing function calls to references can greatly speed up function calling
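
"Changing function calls to references" presumably means binding the inner converter once instead of resolving attributes on every call; a minimal sketch of that pattern (hypothetical class, not the merged code):
```
# Minimal sketch of the "call through a cached reference" pattern: the bound
# methods are looked up once in __init__, so the hot path avoids repeated
# self.dataType.fromInternal attribute resolution.
class StructFieldSketch(object):
    def __init__(self, name, dataType, nullable=True):
        self.name = name
        self.dataType = dataType
        self.nullable = nullable
        self.fromInternal = dataType.fromInternal
        self.toInternal = dataType.toInternal
```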

[GitHub] spark pull request #19234: [SPARK-22010] Change fromInternal method of Times...

2017-09-14 Thread maver1ck
GitHub user maver1ck opened a pull request: https://github.com/apache/spark/pull/19234 [SPARK-22010] Change fromInternal method of TimestampType ## What changes were proposed in this pull request? This PR changes the way pySpark converts Timestamp format from internal
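
For context, Spark stores TimestampType internally as microseconds since the Unix epoch; converting that to a Python datetime without losing microsecond precision can be done roughly like this (a generic sketch, not the PR's specific change):
```
import datetime

def timestamp_from_internal(ts):
    """Convert microseconds-since-epoch (Spark's internal value) to datetime."""
    if ts is None:
        return None
    # Integer div/mod keeps exact microseconds; a float division like ts / 1e6
    # can drop precision for present-day timestamps.
    return datetime.datetime.fromtimestamp(ts // 1000000).replace(microsecond=ts % 1000000)


# Example (result is rendered in the local timezone):
print(timestamp_from_internal(1505383675123456))
```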

[GitHub] spark pull request #18685: Add Weakref to cloudpickle

2017-07-19 Thread maver1ck
GitHub user maver1ck opened a pull request: https://github.com/apache/spark/pull/18685 Add Weakref to cloudpickle https://github.com/cloudpipe/cloudpickle/pull/104/files ## What changes were proposed in this pull request? Possibility to use ABCMeta with Spark
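
The motivation can be illustrated with a hypothetical usage pattern: ABCMeta-based classes hold weak references internally (CPython's abc module caches subclass checks in WeakSets), so shipping such a class to executors needs weakref support in cloudpickle. Names below are made up for the example:
```
from abc import ABCMeta, abstractmethod

from pyspark.sql import SparkSession


class Parser(metaclass=ABCMeta):
    """Abstract base -- ABCMeta keeps WeakSet caches under the hood."""

    @abstractmethod
    def parse(self, line):
        raise NotImplementedError


class CsvParser(Parser):
    def parse(self, line):
        return line.split(",")


if __name__ == "__main__":
    spark = SparkSession.builder.master("local[2]").appName("abcmeta-demo").getOrCreate()
    rdd = spark.sparkContext.parallelize(["a,b", "c,d"])
    # The lambda closes over CsvParser, so cloudpickle has to serialize the
    # class (including its ABCMeta machinery) to send it to the workers.
    print(rdd.map(lambda line: CsvParser().parse(line)).collect())
    spark.stop()
```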

[GitHub] spark issue #17722: [SPARK-12717][PYSPARK][BRANCH-1.6] Resolving race condit...

2017-07-19 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/17722 Hi, what about this issue?

[GitHub] spark pull request #18515: [SPARK-21287] Ability to use Integer.MIN_VALUE as...

2017-07-03 Thread maver1ck
GitHub user maver1ck opened a pull request: https://github.com/apache/spark/pull/18515 [SPARK-21287] Ability to use Integer.MIN_VALUE as a fetchSize ## What changes were proposed in this pull request? FIX for https://issues.apache.org/jira/browse/SPARK-21287 ## How
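
For reference, this is how such a fetch size would be used from the JDBC data source once negative values are accepted; the connection details are placeholders, `fetchsize` is the standard JDBC reader option, and -2147483648 is Integer.MIN_VALUE, which MySQL Connector/J interprets as "stream rows one at a time":
```
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-fetchsize-demo").getOrCreate()

# Placeholder URL/credentials -- only the fetchsize option is the point here.
df = (spark.read.format("jdbc")
      .option("url", "jdbc:mysql://dbhost:3306/mydb")
      .option("dbtable", "big_table")
      .option("user", "spark")
      .option("password", "secret")
      .option("fetchsize", str(-2147483648))  # Integer.MIN_VALUE => streaming result set
      .load())

print(df.count())
```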

[GitHub] spark issue #17694: [SPARK-12717][PYSPARK] Resolving race condition with pys...

2017-04-21 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/17694 @vundela Great. But I'm planning to migrate to 2.1 as soon as 2.1.1 is released.

[GitHub] spark issue #17694: [SPARK-12717][PYSPARK] Resolving race condition with pys...

2017-04-21 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/17694 OK. I did additional tests. The fix works only with Spark 2.1. I tried to apply it to 2.0.2, and that was the cause of my problem.

[GitHub] spark issue #17694: [SPARK-12717][PYSPARK] Resolving race condition with pys...

2017-04-21 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/17694 I checked the pyspark.zip of the running container and everything is in its place. So I assume that there is more than one race condition in this code. I'll try to prepare an example.

[GitHub] spark issue #17694: [SPARK-12717][PYSPARK] Resolving race condition with pys...

2017-04-20 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/17694 The funny thing is this code works for me on 4 threads and throws an exception on 10 threads.

[GitHub] spark issue #17694: [SPARK-12717][PYSPARK] Resolving race condition with pys...

2017-04-20 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/17694 I tested your patch in our environment. The problem still exists. ``` Job aborted due to stage failure: Task 0 in stage 22.0 failed 8 times, most recent failure: Lost task 0.7 in stage

[GitHub] spark issue #17328: [SPARK-19975][Python][SQL] Add map_keys and map_values f...

2017-03-20 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/17328 Looks good :)

[GitHub] spark issue #15599: [SPARK-18022][SQL] java.lang.NullPointerException instea...

2016-10-22 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/15599 I can try this fix on Monday.

[GitHub] spark issue #15106: [SPARK-16439] [SQL] bring back the separator in SQL UI

2016-09-19 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/15106 LGTM.

[GitHub] spark issue #15106: [SPARK-16439] [SQL] bring back the separator in SQL UI

2016-09-15 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/15106 I think this patch could actually work. Number formatting is executed on the server side. I did some tests and it looks good.

[GitHub] spark issue #14340: [SPARK-16534][Streaming][Kafka] Add Python API support f...

2016-09-14 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14340 @rxin, production streaming jobs can be written in Python, and my company and I are an example. I wrote a little bit more in the Jira; I think it's a better place for discussion. https

[GitHub] spark issue #14388: [SPARK-16362][SQL] Support ArrayType and StructType in v...

2016-08-12 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14388 @viirya I will after the weekend.

[GitHub] spark issue #14465: [SPARK-16321] Fixing performance regression when reading...

2016-08-04 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14465 OK.

[GitHub] spark pull request #14465: [SPARK-16321] Fixing performance regression when ...

2016-08-04 Thread maver1ck
Github user maver1ck closed the pull request at: https://github.com/apache/spark/pull/14465

[GitHub] spark pull request #14390: [SPARK-15541] Casting ConcurrentHashMap to Concur...

2016-08-04 Thread maver1ck
GitHub user maver1ck reopened a pull request: https://github.com/apache/spark/pull/14390 [SPARK-15541] Casting ConcurrentHashMap to ConcurrentMap (branch-1.6) ## What changes were proposed in this pull request? Casting ConcurrentHashMap to ConcurrentMap allows running code

[GitHub] spark issue #14390: [SPARK-15541] Casting ConcurrentHashMap to ConcurrentMap...

2016-08-04 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14390 @srowen I missed one change in Catalog.scala

[GitHub] spark issue #14388: [SPARK-16362][SQL] Support ArrayType and StructType in v...

2016-08-03 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14388 @viirya I tried to test your patch on my production workflow. Getting: ``` Py4JJavaError: An error occurred while calling o56.count. : org.apache.spark.SparkException: Job

[GitHub] spark issue #14445: [SPARK-16320] [SQL] Fix performance regression for parqu...

2016-08-03 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14445 @rxin I added some comments to the Jira. I think both problems have solutions right now.

[GitHub] spark pull request #13701: [SPARK-15639][SQL] Try to push down filter at Row...

2016-08-03 Thread maver1ck
Github user maver1ck commented on a diff in the pull request: https://github.com/apache/spark/pull/13701#discussion_r73304180 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -527,4 +536,43 @@ class

[GitHub] spark pull request #13701: [SPARK-15639][SQL] Try to push down filter at Row...

2016-08-03 Thread maver1ck
Github user maver1ck commented on a diff in the pull request: https://github.com/apache/spark/pull/13701#discussion_r73290562 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilterSuite.scala --- @@ -527,4 +536,43 @@ class

[GitHub] spark pull request #14390: [SPARK-15541] Casting ConcurrentHashMap to Concur...

2016-08-02 Thread maver1ck
Github user maver1ck closed the pull request at: https://github.com/apache/spark/pull/14390

[GitHub] spark issue #14390: [SPARK-15541] Casting ConcurrentHashMap to ConcurrentMap...

2016-08-02 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14390 Done. Thank you.

[GitHub] spark issue #14465: [SPARK-16320][SPARK-16321] Fixing performance regression...

2016-08-02 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14465 @davies No problem. I just want to isolate the cause of the performance regression.

[GitHub] spark pull request #14465: [SPARK-16320][SPARK-16321] Fixing performance reg...

2016-08-02 Thread maver1ck
GitHub user maver1ck opened a pull request: https://github.com/apache/spark/pull/14465 [SPARK-16320][SPARK-16321] Fixing performance regression when reading… ## What changes were proposed in this pull request? This PR adds correct support for PPD when using the non-vectorized

[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-08-02 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/13701 @gatorsmile I added a comment in the Jira. "spark.sql.parquet.filterPushdown defaults to true. The Vectorized Reader isn't the case here because I have nested columns (and Vectorized R

[GitHub] spark issue #13701: [SPARK-15639][SQL] Try to push down filter at RowGroups ...

2016-08-02 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/13701 I think that this PR also resolves my problem here. https://issues.apache.org/jira/browse/SPARK-16321?focusedCommentId=15383785&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel

[GitHub] spark issue #14390: [SPARK-15541] Casting ConcurrentHashMap to ConcurrentMap...

2016-08-02 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14390 As you merged https://github.com/apache/spark/pull/14459, I removed the changes in Dispatcher.scala.

[GitHub] spark issue #14445: [SPARK-16320] [SQL] Fix performance regression for parqu...

2016-08-02 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14445 @rxin I tested this patch. The results are almost equal to Spark without this patch (the difference is less than 5%). So maybe it's needed, but it doesn't solve my problem.

[GitHub] spark issue #14390: [SPARK-15541] Casting ConcurrentHashMap to ConcurrentMap...

2016-08-02 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14390 I know that. And this patch is quite different. On master there are changes only in Dispatcher.scala. On branch-1.6 we also need changes in Catalog.scala.

[GitHub] spark issue #14390: [SPARK-15541] Casting ConcurrentHashMap to ConcurrentMap...

2016-08-02 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14390 Could you tell me why? We need different PRs against different branches.

[GitHub] spark issue #14390: [SPARK-15541] Casting ConcurrentHashMap to ConcurrentMap

2016-08-02 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14390 I added another PR against master. I used the following command to find suspicious code. ``` for i in `grep -c -R ConcurrentHashMap | grep -v ':0' | sed -e s/:.*//`; do echo $i; grep keySet $i

[GitHub] spark pull request #14459: [SPARK-15541] Casting ConcurrentHashMap to Concur...

2016-08-02 Thread maver1ck
GitHub user maver1ck opened a pull request: https://github.com/apache/spark/pull/14459 [SPARK-15541] Casting ConcurrentHashMap to ConcurrentMap ## What changes were proposed in this pull request? Casting ConcurrentHashMap to ConcurrentMap allows running code compiled

[GitHub] spark issue #14445: [SPARK-16320] [SQL] Fix performance regression for parqu...

2016-08-02 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14445 @rxin I'll test this patch tomorrow.

[GitHub] spark issue #10909: [SPARK-10086] [MLlib] [Streaming] [PySpark] ignore Strea...

2016-07-28 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/10909 @jkbradley What about merging this to branch-1.6?

[GitHub] spark issue #14390: [SPARK-15541] Casting ConcurrentHashMap to ConcurrentMap

2016-07-28 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14390 @jkbradley Could you look at it? I think this is a problem from: https://issues.apache.org/jira/browse/SPARK-10086 Maybe we should merge this PR to branch-1.6 before testing? https

[GitHub] spark pull request #14390: [SPARK-15541] Casting ConcurrentHashMap to Concur...

2016-07-28 Thread maver1ck
GitHub user maver1ck opened a pull request: https://github.com/apache/spark/pull/14390 [SPARK-15541] Casting ConcurrentHashMap to ConcurrentMap ## What changes were proposed in this pull request? Casting ConcurrentHashMap to ConcurrentMap allows running code compiled

[GitHub] spark issue #11445: [SPARK-13594][SQL] remove typed operations(e.g. map, fla...

2016-07-19 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/11445 @rxin As we're not planning to implement Datasets in Python, is there a plan to revert this Jira?

[GitHub] spark issue #14142: [SPARK-16439] Fix number formatting in SQL UI

2016-07-13 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14142 Merging?

[GitHub] spark issue #14142: [SPARK-16439] Fix number formatting in SQL UI

2016-07-12 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14142 Can we test this?

[GitHub] spark issue #12913: [SPARK-928][CORE] Add support for Unsafe-based serialize...

2016-07-12 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/12913 Up?

[GitHub] spark pull request #14142: [SPARK-16439] Fix number formatting in SQL UI

2016-07-11 Thread maver1ck
GitHub user maver1ck opened a pull request: https://github.com/apache/spark/pull/14142 [SPARK-16439] Fix number formatting in SQL UI ## What changes were proposed in this pull request? The Spark SQL UI displays numbers greater than 1000 with \u00A0 (a non-breaking space) as the grouping separator

[GitHub] spark issue #14054: [SPARK-16226] [SQL] Weaken JDBC isolation level to avoid...

2016-07-05 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/14054 @srowen Maybe we can add this as a configuration option? I'm not sure how this affects performance.

[GitHub] spark issue #13925: [SPARK-16226][SQL]change the way of JDBC commit

2016-07-01 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/13925 @srowen Maybe we should change this condition to `conn.getMetaData().supportsTransactions()`? I can prepare a PR.

[GitHub] spark issue #13925: [SPARK-16226][SQL]change the way of JDBC commit

2016-06-30 Thread maver1ck
Github user maver1ck commented on the issue: https://github.com/apache/spark/pull/13925 @srowen Recently I modified the MySQL JDBC driver because both supportsDataManipulationTransactionsOnly() and supportsDataDefinitionAndDataManipulationTransactions() return false. So

[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

2016-05-12 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/12874#issuecomment-218849616 There is one more thing. We observed that collect_list doesn't work in Spark 2.0: https://issues.apache.org/jira/browse/SPARK-15293

[GitHub] spark pull request: [SPARK-10605][SQL] Create native collect_list/...

2016-05-12 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/12874#issuecomment-218721890 Hi, what about this patch?

[GitHub] spark pull request: [SPARK-12200][SQL] Add __contains__ implementa...

2016-05-11 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/10194#issuecomment-218553241 @holdenk Thanks :) @davies I think everything is OK. Can we also merge it into the 2.0 branch?

[GitHub] spark pull request: [SPARK-12200][SQL] Add __contains__ implementa...

2016-05-11 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/10194#issuecomment-218547285 @davies I fixed the whitespace. Can we test this one more time?

[GitHub] spark pull request: [SPARK-12200][SQL] Add __contains__ implementa...

2016-05-11 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/10194#issuecomment-218384288 @davies But you mentioned the current behaviour. My patch changes it, so you could access the column by `row['col_name']` and check membership with `'col_name' in row
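
A minimal stand-in showing the behaviour being proposed -- item access by name plus membership tests -- with a made-up class (the real pyspark.sql.Row is a tuple subclass with more machinery):
```
# Stand-in Row, only to illustrate __getitem__ by name and __contains__.
class RowSketch(object):
    def __init__(self, **kwargs):
        self._fields = dict(kwargs)

    def __getitem__(self, name):
        return self._fields[name]          # row['col_name']

    def __contains__(self, name):
        return name in self._fields        # 'col_name' in row


row = RowSketch(col_name=42, other=1)
print(row['col_name'])      # 42
print('col_name' in row)    # True
print('missing' in row)     # False
```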

[GitHub] spark pull request: SPARK-13335 Use declarative aggregate for coll...

2016-04-29 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/11688#issuecomment-215676018 Hi, what about this PR? Will it be merged into Spark 2.0?

[GitHub] spark pull request: SPARK-12200 Add __contains__ implementation to...

2016-01-19 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/10194#issuecomment-172985889 @holdenk, @davies Can anyone verify this patch?

[GitHub] spark pull request: [SPARK-12504][SQL] Masking credentials in the ...

2016-01-06 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/10452#issuecomment-169409283 For me this is a critical security issue, so I'd like to have it in the 1.6 branch (I'm sure that 1.6.1 will be available earlier than 2.0.0).

[GitHub] spark pull request: [SPARK-12504][SQL] Masking credentials in the ...

2016-01-06 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/10452#issuecomment-169420636 It's not the explain output but the SQL tab on the Spark web console. As far as I understand, the information there is taken from the same source. Am I right? PS. I'm

[GitHub] spark pull request: [SPARK-12504][SQL] Masking credentials in the ...

2016-01-06 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/10452#issuecomment-169395482 @marmbrus What about merging it to the 1.6 branch?

[GitHub] spark pull request: SPARK-12200 Add __contains__ implementation to...

2015-12-29 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/10194#issuecomment-167743179 @holdenk Do you need anything more?

[GitHub] spark pull request: [SPARK-5095][MESOS] Support capping cores and ...

2015-12-25 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/4027#issuecomment-167260638 I agree. In YARN mode we have per-node configuration: ``` YARN: The --num-executors option to the Spark YARN client controls how many executors

[GitHub] spark pull request: SPARK-12200 Add __contains__ implementation to...

2015-12-24 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/10194#issuecomment-167114248 @holdenk Is it OK to merge this patch?

[GitHub] spark pull request: [SPARK-4226][SQL]Add subquery (not) in/exists ...

2015-12-15 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/9055#issuecomment-164912029 So what's next?

[GitHub] spark pull request: SPARK-12200 Add __contains__ implementation to...

2015-12-12 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/10194#issuecomment-164124983 Done.

[GitHub] spark pull request: SPARK-12200 Add __contains__ implementation to...

2015-12-09 Thread maver1ck
Github user maver1ck commented on the pull request: https://github.com/apache/spark/pull/10194#issuecomment-163521876 OK. I will add a few words to the documentation.
