[jira] [Commented] (SPARK-2141) Add sc.getPersistentRDDs() to PySpark

2015-06-19 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593802#comment-14593802 ] Ruslan Dautkhanov commented on SPARK-2141: -- Would be great to have this

[jira] [Commented] (SPARK-11150) Dynamic partition pruning

2015-10-19 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14963849#comment-14963849 ] Ruslan Dautkhanov commented on SPARK-11150: --- Will partition-wise join also be handled by

[jira] [Commented] (SPARK-10935) Avito Context Ad Clicks

2016-02-06 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15136053#comment-15136053 ] Ruslan Dautkhanov commented on SPARK-10935: --- I noticed outer joins. Spark before 1.5 used

[jira] [Comment Edited] (SPARK-10935) Avito Context Ad Clicks

2016-02-06 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15136053#comment-15136053 ] Ruslan Dautkhanov edited comment on SPARK-10935 at 2/6/16 11:16 PM: I

[jira] [Commented] (SPARK-11111) Fast null-safe join

2016-01-31 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15125659#comment-15125659 ] Ruslan Dautkhanov commented on SPARK-11111: --- Does this affect all OUTER JOINS? I have poor

[jira] [Commented] (SPARK-13335) Optimize Data Frames collect_list and collect_set with declarative aggregates

2016-03-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15192588#comment-15192588 ] Ruslan Dautkhanov commented on SPARK-13335: --- It would be great to have this optimization in. In

[jira] [Created] (SPARK-13849) REGEX Column Specification

2016-03-13 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-13849: - Summary: REGEX Column Specification Key: SPARK-13849 URL: https://issues.apache.org/jira/browse/SPARK-13849 Project: Spark Issue Type: Wish

[jira] [Commented] (SPARK-12139) REGEX Column Specification for Hive Queries

2016-03-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193636#comment-15193636 ] Ruslan Dautkhanov commented on SPARK-12139: --- Please check if

[jira] [Commented] (SPARK-13849) REGEX Column Specification

2016-03-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15193632#comment-15193632 ] Ruslan Dautkhanov commented on SPARK-13849: --- Thank you, Sean. I suspect this is a duplicate of

[jira] [Created] (SPARK-14017) dataframe.dtypes -> pyspark.sql.types aliases

2016-03-18 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-14017: - Summary: dataframe.dtypes -> pyspark.sql.types aliases Key: SPARK-14017 URL: https://issues.apache.org/jira/browse/SPARK-14017 Project: Spark

[jira] [Commented] (SPARK-13866) Handle decimal type in CSV inference

2016-03-19 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15197595#comment-15197595 ] Ruslan Dautkhanov commented on SPARK-13866: --- Would be great to have this fix in. It makes

[jira] [Created] (SPARK-14166) Add deterministic sampling like in Hive

2016-03-25 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-14166: - Summary: Add deterministic sampling like in Hive Key: SPARK-14166 URL: https://issues.apache.org/jira/browse/SPARK-14166 Project: Spark Issue

[jira] [Commented] (SPARK-13263) SQL generation support for tablesample

2016-03-24 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15211047#comment-15211047 ] Ruslan Dautkhanov commented on SPARK-13263: --- Would Spark support deterministic sampling too?

[jira] [Created] (SPARK-15302) Implement FK/PK "rely novalidate" constraints for better CBO

2016-05-12 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-15302: - Summary: Implement FK/PK "rely novalidate" constraints for better CBO Key: SPARK-15302 URL: https://issues.apache.org/jira/browse/SPARK-15302 Project:

[jira] [Updated] (SPARK-15302) Implement FK/PK "rely novalidate" constraints for better CBO

2016-05-12 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-15302: -- Description: Oracle has "RELY NOVALIDATE" option for constraints. Could be easier for

[jira] [Commented] (SPARK-15302) Implement FK/PK "rely novalidate" constraints for better CBO

2016-05-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283194#comment-15283194 ] Ruslan Dautkhanov commented on SPARK-15302: --- Yes, it's a feature request for Spark. See for

[jira] [Comment Edited] (SPARK-15302) Implement FK/PK "rely novalidate" constraints for better CBO

2016-05-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283194#comment-15283194 ] Ruslan Dautkhanov edited comment on SPARK-15302 at 5/13/16 9:45 PM:

[jira] (SPARK-18105) LZ4 failed to decompress a stream of shuffled data

2017-01-30 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov commented on SPARK-18105

[jira] (SPARK-18105) LZ4 failed to decompress a stream of shuffled data

2017-01-30 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov commented on SPARK-18105

[jira] (SPARK-18105) LZ4 failed to decompress a stream of shuffled data

2017-01-29 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov commented on SPARK-18105

[jira] [Commented] (SPARK-19038) Can't find keytab file when using Hive catalog

2017-02-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864968#comment-15864968 ] Ruslan Dautkhanov commented on SPARK-19038: --- [~jerryshao] PR 16482 is for a different issue

[jira] [Updated] (SPARK-19588) Allow putting keytab file to HDFS location specified in spark.yarn.keytab

2017-02-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-19588: -- Summary: Allow putting keytab file to HDFS location specified in spark.yarn.keytab

[jira] [Commented] (SPARK-16026) Cost-based Optimizer framework

2017-02-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864992#comment-15864992 ] Ruslan Dautkhanov commented on SPARK-16026: --- [~ioana-delaney], (y) > Cost-based Optimizer

[jira] [Created] (SPARK-19588) Allow putting keytab files specified by

2017-02-13 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-19588: - Summary: Allow putting keytab files specified by Key: SPARK-19588 URL: https://issues.apache.org/jira/browse/SPARK-19588 Project: Spark Issue

[jira] [Commented] (SPARK-19038) Can't find keytab file when using Hive catalog

2017-02-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864995#comment-15864995 ] Ruslan Dautkhanov commented on SPARK-19038: --- Thank you [~jerryshao] > Can't find keytab file

[jira] [Comment Edited] (SPARK-19038) Can't find keytab file when using Hive catalog

2017-02-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865013#comment-15865013 ] Ruslan Dautkhanov edited comment on SPARK-19038 at 2/14/17 4:12 AM:

[jira] [Commented] (SPARK-19588) Allow putting keytab file to HDFS location specified in spark.yarn.keytab

2017-02-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865020#comment-15865020 ] Ruslan Dautkhanov commented on SPARK-19588: --- Our corporate sshd's are integrated with Active

[jira] [Commented] (SPARK-19038) Can't find keytab file when using Hive catalog

2017-02-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865013#comment-15865013 ] Ruslan Dautkhanov commented on SPARK-19038: --- Another possible workaround is to pass principal

[jira] [Commented] (SPARK-19588) Allow putting keytab file to HDFS location specified in spark.yarn.keytab

2017-02-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866839#comment-15866839 ] Ruslan Dautkhanov commented on SPARK-19588: --- driver/yarn#client holds keytab just to distribute

[jira] [Commented] (SPARK-19588) Allow putting keytab file to HDFS location specified in spark.yarn.keytab

2017-02-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867000#comment-15867000 ] Ruslan Dautkhanov commented on SPARK-19588: --- Got it. Thanks [~vanzin] > Allow putting keytab

[jira] [Commented] (SPARK-16026) Cost-based Optimizer framework

2017-02-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15863243#comment-15863243 ] Ruslan Dautkhanov commented on SPARK-16026: --- It would be great to resolve SPARK-15302 as a

[jira] [Comment Edited] (SPARK-17076) Cardinality estimation of join operator

2016-11-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665644#comment-15665644 ] Ruslan Dautkhanov edited comment on SPARK-17076 at 11/15/16 1:19 AM: -

[jira] [Commented] (SPARK-17076) Cardinality estimation of join operator

2016-11-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15665644#comment-15665644 ] Ruslan Dautkhanov commented on SPARK-17076: --- HIVE-13076 added FK constraint to Hive / HMS.

[jira] [Commented] (SPARK-5493) Support proxy users under kerberos

2017-01-10 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15815352#comment-15815352 ] Ruslan Dautkhanov commented on SPARK-5493: -- Thanks again [~vanzin] for the feedback, please check

[jira] [Created] (SPARK-19143) API in Spark for distributing new delegation tokens (Improve delegation token handling in secure clusters)

2017-01-09 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-19143: - Summary: API in Spark for distributing new delegation tokens (Improve delegation token handling in secure clusters) Key: SPARK-19143 URL:

[jira] [Commented] (SPARK-8007) Support resolving virtual columns in DataFrames

2016-12-03 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15718615#comment-15718615 ] Ruslan Dautkhanov commented on SPARK-8007: -- Is spark__partition__id available in PySpark too?

[jira] [Comment Edited] (SPARK-8007) Support resolving virtual columns in DataFrames

2016-12-03 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15718615#comment-15718615 ] Ruslan Dautkhanov edited comment on SPARK-8007 at 12/3/16 7:34 PM: --- Is

[jira] [Commented] (SPARK-5493) Support proxy users under kerberos

2017-01-05 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15802154#comment-15802154 ] Ruslan Dautkhanov commented on SPARK-5493: -- Thank you [~vanzin]! I guess a forwardable and/or

[jira] [Comment Edited] (SPARK-5493) Support proxy users under kerberos

2017-01-06 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15805198#comment-15805198 ] Ruslan Dautkhanov edited comment on SPARK-5493 at 1/6/17 6:19 PM: --

[jira] [Commented] (SPARK-5493) Support proxy users under kerberos

2017-01-06 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15805198#comment-15805198 ] Ruslan Dautkhanov commented on SPARK-5493: -- {quote}There might be ways to hack support for that

[jira] [Commented] (SPARK-5158) Allow for keytab-based HDFS security in Standalone mode

2017-01-04 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800298#comment-15800298 ] Ruslan Dautkhanov commented on SPARK-5158: -- I think one reason for that could be that one user

[jira] [Commented] (SPARK-5493) Support proxy users under kerberos

2017-01-04 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15800304#comment-15800304 ] Ruslan Dautkhanov commented on SPARK-5493: -- Did you figure this out? Is this possible to use

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2016-12-16 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15755560#comment-15755560 ] Ruslan Dautkhanov commented on SPARK-12837: --- Yep, we continue to see this issue in Spark 2.. >

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-10 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963510#comment-15963510 ] Ruslan Dautkhanov commented on SPARK-12837: --- It might be a bug in broadcast join. Following

[jira] [Comment Edited] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-10 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963510#comment-15963510 ] Ruslan Dautkhanov edited comment on SPARK-12837 at 4/10/17 9:29 PM:

[jira] [Commented] (SPARK-12837) Spark driver requires large memory space for serialized results even there are no data collected to the driver

2017-04-10 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963812#comment-15963812 ] Ruslan Dautkhanov commented on SPARK-12837: --- [~cloud_fan] I didn't realize torrent broadcast

[jira] [Comment Edited] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-07-28 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16105127#comment-16105127 ] Ruslan Dautkhanov edited comment on SPARK-21274 at 7/28/17 3:46 PM:

[jira] [Commented] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-07-28 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16105127#comment-16105127 ] Ruslan Dautkhanov commented on SPARK-21274: --- [~viirya], yes it returns {noformat}[1, 2,

[jira] [Comment Edited] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-07-28 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16105127#comment-16105127 ] Ruslan Dautkhanov edited comment on SPARK-21274 at 7/28/17 3:47 PM:

[jira] [Commented] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-07-31 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16107749#comment-16107749 ] Ruslan Dautkhanov commented on SPARK-21274: --- [~viirya], you're right. I've checked now on both

[jira] [Comment Edited] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-15 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127511#comment-16127511 ] Ruslan Dautkhanov edited comment on SPARK-21657 at 8/15/17 4:59 PM:

[jira] [Commented] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-15 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127511#comment-16127511 ] Ruslan Dautkhanov commented on SPARK-21657: --- Thank you [~maropu] and [~viirya], that commit is

[jira] [Commented] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-10 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16121801#comment-16121801 ] Ruslan Dautkhanov commented on SPARK-21657: --- [~bjornjons] confirms this problem pertains to

[jira] [Updated] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21657: -- Description: It can take up to half a day to explode a modest-sized nested collection

[jira] [Issue Comment Deleted] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-07-11 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21274: -- Comment: was deleted (was: [~rxin], I wish I could. We only use PySpark and SQL API to

[jira] [Commented] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2017-07-11 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16082900#comment-16082900 ] Ruslan Dautkhanov commented on SPARK-13534: --- So Apache Arrow would currently be available only

[jira] [Commented] (SPARK-15703) Make ListenerBus event queue size configurable

2017-07-17 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091031#comment-16091031 ] Ruslan Dautkhanov commented on SPARK-15703: --- It breaks spark dynamic allocation too - Spark

[jira] [Created] (SPARK-21460) Spark dynamic allocation breaks when ListenerBus event queue runs full

2017-07-18 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-21460: - Summary: Spark dynamic allocation breaks when ListenerBus event queue runs full Key: SPARK-21460 URL: https://issues.apache.org/jira/browse/SPARK-21460

[jira] [Commented] (SPARK-15703) Make ListenerBus event queue size configurable

2017-07-18 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091821#comment-16091821 ] Ruslan Dautkhanov commented on SPARK-15703: --- [~tgraves], filed SPARK-21460. Thanks. > Make

[jira] [Commented] (SPARK-21460) Spark dynamic allocation breaks when ListenerBus event queue runs full

2017-07-18 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091943#comment-16091943 ] Ruslan Dautkhanov commented on SPARK-21460: --- [~zsxwing], according to [~tgraves] spark dynamic

[jira] [Commented] (SPARK-21460) Spark dynamic allocation breaks when ListenerBus event queue runs full

2017-07-18 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16091972#comment-16091972 ] Ruslan Dautkhanov commented on SPARK-21460: --- [~zsxwing] can we keep this jira open?

[jira] [Commented] (SPARK-21460) Spark dynamic allocation breaks when ListenerBus event queue runs full

2017-07-20 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16095131#comment-16095131 ] Ruslan Dautkhanov commented on SPARK-21460: --- [~Dhruve Ashar], I can email logs to you. Although

[jira] [Updated] (SPARK-21488) Make saveAsTable() and createOrReplaceTempView() return dataframe of created table/ created view

2017-07-20 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21488: -- Description: It would be great to make saveAsTable() return dataframe of created

[jira] [Created] (SPARK-21488) Make saveAsTable() return dataframe of created table

2017-07-20 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-21488: - Summary: Make saveAsTable() return dataframe of created table Key: SPARK-21488 URL: https://issues.apache.org/jira/browse/SPARK-21488 Project: Spark

[jira] [Updated] (SPARK-21488) Make saveAsTable() and createOrReplaceTempView() return dataframe of created table/ created view

2017-07-20 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21488: -- Summary: Make saveAsTable() and createOrReplaceTempView() return dataframe of created

[jira] [Commented] (SPARK-13534) Implement Apache Arrow serializer for Spark DataFrame for use in DataFrame.toPandas

2017-07-12 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16084886#comment-16084886 ] Ruslan Dautkhanov commented on SPARK-13534: --- [~bryanc], thanks for the feedback. We sometimes

[jira] [Commented] (SPARK-21488) Make saveAsTable() and createOrReplaceTempView() return dataframe of created table/ created view

2017-07-21 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16096354#comment-16096354 ] Ruslan Dautkhanov commented on SPARK-21488: --- Makes sense [~zsxwing], thank you. > Make

[jira] [Updated] (SPARK-21488) Make saveAsTable() and createOrReplaceTempView() return dataframe of created table/ created view

2017-07-21 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21488: -- Target Version/s: 3.0.0 > Make saveAsTable() and createOrReplaceTempView() return

[jira] [Created] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-06-30 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-21274: - Summary: Implement EXCEPT ALL and INTERSECT ALL Key: SPARK-21274 URL: https://issues.apache.org/jira/browse/SPARK-21274 Project: Spark Issue Type:

[jira] [Commented] (SPARK-13225) [SQL] Support Intersect All/Distinct

2017-06-30 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070881#comment-16070881 ] Ruslan Dautkhanov commented on SPARK-13225: --- Please consider this approach to implement

[jira] [Commented] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-06-30 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070884#comment-16070884 ] Ruslan Dautkhanov commented on SPARK-21274: --- For INTERSECT ALL I was also experimenting with

[jira] [Commented] (SPARK-21274) Implement EXCEPT ALL and INTERSECT ALL

2017-06-30 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16070982#comment-16070982 ] Ruslan Dautkhanov commented on SPARK-21274: --- [~rxin], I wish I could. We only use PySpark and

[jira] [Commented] (SPARK-16803) SaveAsTable does not work when source DataFrame is built on a Hive Table

2017-07-05 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16075178#comment-16075178 ] Ruslan Dautkhanov commented on SPARK-16803: --- Any chance `saveAsTable` can be reverted to use

[jira] [Created] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-07 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-21657: - Summary: Spark has exponential time complexity to explode(array of structs) Key: SPARK-21657 URL: https://issues.apache.org/jira/browse/SPARK-21657

[jira] [Updated] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-07 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21657: -- Description: It can take up to half a day to explode a modest-sized nested collection

[jira] [Updated] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-07 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21657: -- Attachment: ExponentialTimeGrowth.PNG nested-data-generator-and-test.py

[jira] [Updated] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-07 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21657: -- Description: It can take up to half a day to explode a modest-sized nested collection

[jira] [Commented] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-07 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16117051#comment-16117051 ] Ruslan Dautkhanov commented on SPARK-21657: --- Absolutely, this is a real use case. We have a

[jira] [Updated] (SPARK-21657) Spark has exponential time complexity to explode(array of structs)

2017-08-07 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21657: -- Labels: cache caching collections nested_types performance pyspark sparksql sql (was:

[jira] [Commented] (SPARK-15214) Implement code generation for Generate

2017-08-23 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16137947#comment-16137947 ] Ruslan Dautkhanov commented on SPARK-15214: --- [~hvanhovell], please have a look at SPARK-21657

[jira] [Commented] (SPARK-15703) Make ListenerBus event queue size configurable

2017-05-16 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013208#comment-16013208 ] Ruslan Dautkhanov commented on SPARK-15703: --- We keep running into this issue too - would be

[jira] [Comment Edited] (SPARK-15703) Make ListenerBus event queue size configurable

2017-05-23 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013208#comment-16013208 ] Ruslan Dautkhanov edited comment on SPARK-15703 at 5/23/17 10:53 PM: -

[jira] [Commented] (SPARK-20776) Fix JobProgressListener perf. problems caused by empty TaskMetrics initialization

2017-05-18 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016168#comment-16016168 ] Ruslan Dautkhanov commented on SPARK-20776: --- Thank you [~joshrosen]. Would it be possible to

[jira] [Commented] (SPARK-18838) High latency of event processing for large jobs

2017-05-18 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16016664#comment-16016664 ] Ruslan Dautkhanov commented on SPARK-18838: --- my 2 cents. Would be nice to explore idea of

[jira] [Commented] (SPARK-16441) Spark application hang when dynamic allocation is enabled

2017-05-16 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16013247#comment-16013247 ] Ruslan Dautkhanov commented on SPARK-16441: --- We did not have

[jira] [Created] (SPARK-22081) Generalized Reduced Error Logistic Regression

2017-09-20 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-22081: - Summary: Generalized Reduced Error Logistic Regression Key: SPARK-22081 URL: https://issues.apache.org/jira/browse/SPARK-22081 Project: Spark

[jira] [Updated] (SPARK-22081) Generalized Reduced Error Logistic Regression

2017-09-20 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-22081: -- Attachment: RELR.GIF > Generalized Reduced Error Logistic Regression >

[jira] [Updated] (SPARK-22081) Generalized Reduced Error Logistic Regression

2017-09-20 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-22081: -- Description: Our SAS gurus are saying they would love to have "Generalized Reduced

[jira] [Created] (SPARK-21931) add LNNVL function

2017-09-05 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-21931: - Summary: add LNNVL function Key: SPARK-21931 URL: https://issues.apache.org/jira/browse/SPARK-21931 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-21931) add LNNVL function

2017-09-05 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21931: -- Attachment: Capture1.JPG > add LNNVL function > -- > >

[jira] [Updated] (SPARK-21931) add LNNVL function

2017-09-05 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21931: -- Description: Purpose LNNVL provides a concise way to evaluate a condition when one or

[jira] [Updated] (SPARK-21931) add LNNVL function

2017-09-05 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-21931: -- Description: Purpose LNNVL provides a concise way to evaluate a condition when one or

[jira] [Commented] (SPARK-21931) add LNNVL function

2017-09-06 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16155797#comment-16155797 ] Ruslan Dautkhanov commented on SPARK-21931: --- Example 1) {code:sql} select * from products

[jira] [Created] (SPARK-21978) schemaInference option not to convert strings with leading zeros to int/long

2017-09-11 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-21978: - Summary: schemaInference option not to convert strings with leading zeros to int/long Key: SPARK-21978 URL: https://issues.apache.org/jira/browse/SPARK-21978

[jira] [Commented] (SPARK-21213) Support collecting partition-level statistics: rowCount and sizeInBytes

2017-10-17 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16208027#comment-16208027 ] Ruslan Dautkhanov commented on SPARK-21213: --- Would the partition-level stats be compatible with

[jira] [Created] (SPARK-22505) toDF() / createDataFrame() type inference doesn't work as expected

2017-11-12 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-22505: - Summary: toDF() / createDataFrame() type inference doesn't work as expected Key: SPARK-22505 URL: https://issues.apache.org/jira/browse/SPARK-22505

[jira] [Updated] (SPARK-22505) toDF() / createDataFrame() type inference doesn't work as expected

2017-11-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-22505: -- Description: {code} df =

[jira] [Updated] (SPARK-22505) toDF() / createDataFrame() type inference doesn't work as expected

2017-11-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-22505: -- Description: {code} df =

[jira] [Updated] (SPARK-22505) toDF() / createDataFrame() type inference doesn't work as expected

2017-11-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-22505: -- Description: {code} df =

[jira] [Commented] (SPARK-22505) toDF() / createDataFrame() type inference doesn't work as expected

2017-11-22 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263733#comment-16263733 ] Ruslan Dautkhanov commented on SPARK-22505: --- that's great. thank you [~hyukjin.kwon] > toDF()
