[jira] [Reopened] (SPARK-26594) DataSourceOptions.asMap should return CaseInsensitiveMap

2019-07-16 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim reopened SPARK-26594: -- > DataSourceOptions.asMap should return CaseInsensitiveMap >

[jira] [Resolved] (SPARK-26594) DataSourceOptions.asMap should return CaseInsensitiveMap

2019-07-16 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-26594. -- Resolution: Duplicate > DataSourceOptions.asMap should return CaseInsensitiveMap >

[jira] [Resolved] (SPARK-26594) DataSourceOptions.asMap should return CaseInsensitiveMap

2019-07-16 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-26594. -- Resolution: Not A Problem SPARK-27106 supersedes it. Closing. > DataSourceOptions.asMap

[jira] [Updated] (SPARK-28366) Logging in driver when loading single large unsplittable file

2019-07-16 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weichen Xu updated SPARK-28366: --- Summary: Logging in driver when loading single large unsplittable file (was: Logging in driver

[jira] [Updated] (SPARK-26021) -0.0 and 0.0 not treated consistently, doesn't match Hive

2019-07-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-26021: --- Labels: correctness (was: ) > -0.0 and 0.0 not treated consistently, doesn't match Hive >

[jira] [Updated] (SPARK-26352) join reordering should not change the order of output attributes

2019-07-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-26352: --- Labels: correctness (was: ) > join reordering should not change the order of output attributes >

[jira] [Updated] (SPARK-27485) EnsureRequirements.reorder should handle duplicate expressions gracefully

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27485: -- Fix Version/s: 2.4.4 > EnsureRequirements.reorder should handle duplicate expressions

[jira] [Updated] (SPARK-26864) Query may return incorrect result when python udf is used as a join condition and the udf uses attributes from both legs of left semi join.

2019-07-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-26864: --- Labels: correctness (was: ) > Query may return incorrect result when python udf is used as a join

[jira] [Updated] (SPARK-27134) array_distinct function does not work correctly with columns containing array of array

2019-07-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27134: --- Labels: correctness (was: ) > array_distinct function does not work correctly with columns

[jira] [Resolved] (SPARK-27963) Allow dynamic allocation without an external shuffle service

2019-07-16 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-27963. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24817

[jira] [Assigned] (SPARK-27963) Allow dynamic allocation without an external shuffle service

2019-07-16 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-27963: -- Assignee: Marcelo Vanzin > Allow dynamic allocation without an external shuffle

[jira] [Resolved] (SPARK-18299) Allow more aggregations on KeyValueGroupedDataset

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-18299. --- Resolution: Fixed Fix Version/s: 3.0.0 This is resolved via

[jira] [Comment Edited] (SPARK-27416) UnsafeMapData & UnsafeArrayData Kryo serialization breaks when two machines have different Oops size

2019-07-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886535#comment-16886535 ] Josh Rosen edited comment on SPARK-27416 at 7/16/19 10:56 PM: -- I think that

[jira] [Commented] (SPARK-27416) UnsafeMapData & UnsafeArrayData Kryo serialization breaks when two machines have different Oops size

2019-07-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886535#comment-16886535 ] Josh Rosen commented on SPARK-27416: I think that we should backport this for Spark 2.4.4 because

[jira] [Comment Edited] (SPARK-27406) UnsafeArrayData serialization breaks when two machines have different Oops size

2019-07-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886533#comment-16886533 ] Josh Rosen edited comment on SPARK-27406 at 7/16/19 10:51 PM: -- Adding the

[jira] [Updated] (SPARK-27406) UnsafeArrayData serialization breaks when two machines have different Oops size

2019-07-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-27406: --- Labels: correctness (was: ) Adding the 'correctness' label to this fixed issue because the related

[jira] [Updated] (SPARK-10914) UnsafeRow serialization breaks when two machines have different Oops size

2019-07-16 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-10914: --- Labels: correctness (was: ) > UnsafeRow serialization breaks when two machines have different Oops

[jira] [Commented] (SPARK-27911) PySpark Packages should automatically choose correct scala version

2019-07-16 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886479#comment-16886479 ] Michael Armbrust commented on SPARK-27911: -- You are right, there is nothing pyspark specific

[jira] [Created] (SPARK-28417) Spark Submit does not use Proxy User Credentials to Resolve Path for Resources

2019-07-16 Thread Abhishek Modi (JIRA)
Abhishek Modi created SPARK-28417: - Summary: Spark Submit does not use Proxy User Credentials to Resolve Path for Resources Key: SPARK-28417 URL: https://issues.apache.org/jira/browse/SPARK-28417

[jira] [Commented] (SPARK-28182) Spark fails to download Hive 2.2 and 2.3 jars from maven

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886470#comment-16886470 ] Dongjoon Hyun commented on SPARK-28182: --- Thank you for reporting, [~emlyn]. This is hard to find

[jira] [Updated] (SPARK-28182) Spark fails to download Hive 2.2 and 2.3 jars from maven

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28182: -- Summary: Spark fails to download Hive 2.2 and 2.3 jars from maven (was: Spark fails to

[jira] [Updated] (SPARK-27485) EnsureRequirements.reorder should handle duplicate expressions gracefully

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27485: -- Summary: EnsureRequirements.reorder should handle duplicate expressions gracefully (was:

[jira] [Created] (SPARK-28416) Use java.time API in timestampAddInterval

2019-07-16 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-28416: -- Summary: Use java.time API in timestampAddInterval Key: SPARK-28416 URL: https://issues.apache.org/jira/browse/SPARK-28416 Project: Spark Issue Type: Sub-task

[jira] [Commented] (SPARK-23758) MLlib 2.4 Roadmap

2019-07-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886406#comment-16886406 ] Marco Gaido commented on SPARK-23758: - [~dongjoon] seems weird to set the affected version to 3.0

[jira] [Commented] (SPARK-28269) Pandas Grouped Map UDF can get deadlocked

2019-07-16 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886400#comment-16886400 ] Bryan Cutler commented on SPARK-28269: -- cc [~icexelloss] > Pandas Grouped Map UDF can get

[jira] [Updated] (SPARK-28269) Pandas Grouped Map UDF can get deadlocked

2019-07-16 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler updated SPARK-28269: - Summary: Pandas Grouped Map UDF can get deadlocked (was: ArrowStreamPandasSerializer get

[jira] [Created] (SPARK-28415) Add messageHandler to Kafka 10 direct stream API

2019-07-16 Thread Michael Spector (JIRA)
Michael Spector created SPARK-28415: --- Summary: Add messageHandler to Kafka 10 direct stream API Key: SPARK-28415 URL: https://issues.apache.org/jira/browse/SPARK-28415 Project: Spark Issue

[jira] [Commented] (SPARK-28086) Adds `random()` sql function

2019-07-16 Thread Dylan Guedes (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886382#comment-16886382 ] Dylan Guedes commented on SPARK-28086: -- Well, to be fair I've created the JIRA because the `rand()`

[jira] [Commented] (SPARK-28411) insertInto with overwrite inconsistent behaviour Python/Scala

2019-07-16 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886381#comment-16886381 ] Huaxin Gao commented on SPARK-28411: I am working on this. Will submit a PR soon. > insertInto

[jira] [Commented] (SPARK-28086) Adds `random()` sql function

2019-07-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886374#comment-16886374 ] Sean Owen commented on SPARK-28086: --- Yeah if this is just an alias... OK that seems simple but is this

[jira] [Commented] (SPARK-28134) Trigonometric Functions

2019-07-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886371#comment-16886371 ] Sean Owen commented on SPARK-28134: --- I'm not sure this is worth it. These just take degrees as an

[jira] [Commented] (SPARK-27812) kubernetes client import non-daemon thread which block jvm exit.

2019-07-16 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886359#comment-16886359 ] Imran Rashid commented on SPARK-27812: -- as mentioned elsewhere, installing an uncaught exception

[jira] [Resolved] (SPARK-27959) Change YARN resource configs to use .amount

2019-07-16 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-27959. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24989

[jira] [Commented] (SPARK-23443) Spark with Glue as external catalog

2019-07-16 Thread Devin Boyer (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886352#comment-16886352 ] Devin Boyer commented on SPARK-23443: - FWIW, a little while back AWS released their implementation

[jira] [Assigned] (SPARK-27959) Change YARN resource configs to use .amount

2019-07-16 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-27959: -- Assignee: Thomas Graves > Change YARN resource configs to use .amount >

[jira] [Created] (SPARK-28414) Standalone worker/master UI updates for Resource scheduling

2019-07-16 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-28414: - Summary: Standalone worker/master UI updates for Resource scheduling Key: SPARK-28414 URL: https://issues.apache.org/jira/browse/SPARK-28414 Project: Spark

[jira] [Updated] (SPARK-28182) Spark fails to download Hive 2.2+ jars from maven

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28182: -- Component/s: PySpark > Spark fails to download Hive 2.2+ jars from maven >

[jira] [Updated] (SPARK-27929) make sql percentile function be able to receive double frq

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27929: -- Affects Version/s: (was: 2.4.3) 3.0.0 > make sql percentile

[jira] [Updated] (SPARK-27568) readLock leaked when method take() called on a cached rdd

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27568: -- Affects Version/s: (was: 2.4.0) (was: 2.3.0)

[jira] [Updated] (SPARK-28227) Spark can’t support TRANSFORM with aggregation

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28227: -- Affects Version/s: (was: 2.4.0) (was: 1.6.0)

[jira] [Updated] (SPARK-28360) The serviceAccountName configuration item does not take effect in client mode.

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28360: -- Affects Version/s: (was: 2.4.3) (was: 2.4.2)

[jira] [Updated] (SPARK-26769) partition prunning in inner join

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26769: -- Affects Version/s: (was: 2.4.0) 3.0.0 > partition prunning in

[jira] [Updated] (SPARK-25034) possible triple memory consumption in fetchBlockSync()

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25034: -- Affects Version/s: (was: 2.4.0) (was: 2.2.2)

[jira] [Updated] (SPARK-27388) expression encoder for avro objects

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27388: -- Affects Version/s: (was: 2.4.1) 3.0.0 > expression encoder for

[jira] [Updated] (SPARK-25166) Reduce the number of write operations for shuffle write.

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25166: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Reduce the number of write

[jira] [Updated] (SPARK-25285) Add executor task metrics to track the number of tasks started and of tasks successfully completed

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25285: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Add executor task metrics

[jira] [Updated] (SPARK-26302) retainedBatches configuration can eat up memory on driver

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26302: -- Affects Version/s: (was: 2.4.0) 3.0.0 > retainedBatches

[jira] [Updated] (SPARK-23967) Description add native sql show in SQL page.

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23967: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Description add native sql

[jira] [Updated] (SPARK-25513) Read zipped CSV and JSON files

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25513: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Read zipped CSV and JSON

[jira] [Updated] (SPARK-27669) Refactor DataFrameWriter to resolve datasources in a command

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27669: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Refactor DataFrameWriter

[jira] [Updated] (SPARK-25963) Optimize generate followed by window

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25963: -- Affects Version/s: (was: 2.4.0) (was: 2.3.0)

[jira] [Updated] (SPARK-25937) Support user-defined schema in Kafka Source & Sink

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25937: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Support user-defined

[jira] [Updated] (SPARK-24780) DataFrame.column_name should resolve to a distinct ref

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24780: -- Affects Version/s: (was: 2.4.0) 3.0.0 > DataFrame.column_name

[jira] [Updated] (SPARK-27583) Optimize strongly connected components

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27583: -- Affects Version/s: (was: 2.4.2) 3.0.0 > Optimize strongly

[jira] [Updated] (SPARK-26173) Prior regularization for Logistic Regression

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26173: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Prior regularization for

[jira] [Updated] (SPARK-24148) Adding Ability to Specify SQL Type of Empty Arrays

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24148: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Adding Ability to Specify

[jira] [Updated] (SPARK-24625) put all the backward compatible behavior change configs under spark.sql.legacy.*

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24625: -- Affects Version/s: (was: 2.4.0) 3.0.0 > put all the backward

[jira] [Updated] (SPARK-24513) Attribute support in UnaryTransformer

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24513: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Attribute support in

[jira] [Updated] (SPARK-25083) remove the type erasure hack in data source scan

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25083: -- Affects Version/s: (was: 2.4.0) 3.0.0 > remove the type erasure

[jira] [Updated] (SPARK-25236) Investigate using a logging library inside of PySpark on the workers instead of print

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25236: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Investigate using a

[jira] [Updated] (SPARK-27171) Support Full-Partiton limit in the first scan

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27171: -- Affects Version/s: (was: 2.3.2) (was: 2.4.0)

[jira] [Updated] (SPARK-26104) make pci devices visible to task scheduler

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26104: -- Affects Version/s: (was: 2.4.0) 3.0.0 > make pci devices visible

[jira] [Updated] (SPARK-25239) Spark Streaming for Kafka should allow uniform batch size per partition for streaming RDD

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25239: -- Affects Version/s: (was: 2.4.0) (was: 2.2.0)

[jira] [Updated] (SPARK-28069) Switch log directory from Spark UI without restarting history server

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28069: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Switch log directory from

[jira] [Updated] (SPARK-20901) Feature parity for ORC with Parquet

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-20901: -- Affects Version/s: (was: 2.4.0) (was: 2.2.1)

[jira] [Updated] (SPARK-28097) Map ByteType to SMALLINT when using JDBC with PostgreSQL

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28097: -- Affects Version/s: (was: 2.4.3) (was: 2.3.3)

[jira] [Updated] (SPARK-27734) Add memory based thresholds for shuffle spill

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27734: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Add memory based

[jira] [Updated] (SPARK-25569) Failing a Spark job when an accumulator cannot be updated

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25569: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Failing a Spark job when

[jira] [Updated] (SPARK-28158) Hive UDFs supports UDT type

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28158: -- Affects Version/s: (was: 2.4.3) > Hive UDFs supports UDT type >

[jira] [Updated] (SPARK-26439) Introduce WorkerOffer reservation mechanism for Barrier TaskSet

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26439: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Introduce WorkerOffer

[jira] [Updated] (SPARK-27841) Improve UTF8String fromString()/toString()/numChars() performance when strings are ASCII

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27841: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Improve UTF8String

[jira] [Updated] (SPARK-27865) Spark SQL support 1:N sort merge bucket join without shuffle

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27865: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Spark SQL support 1:N sort

[jira] [Updated] (SPARK-27561) Support "lateral column alias references" to allow column aliases to be used within SELECT clauses

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27561: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Support "lateral column

[jira] [Updated] (SPARK-27616) Standalone cluster management user resource allocation

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27616: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Standalone cluster

[jira] [Updated] (SPARK-24432) Add support for dynamic resource allocation

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24432: -- Affects Version/s: (was: 2.4.0) > Add support for dynamic resource allocation >

[jira] [Updated] (SPARK-24994) When the data type of the field is converted to other types, it can also support pushdown to parquet

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24994: -- Affects Version/s: (was: 2.4.0) 3.0.0 > When the data type of the

[jira] [Updated] (SPARK-27808) Ability to ignore existing files for structured streaming

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27808: -- Affects Version/s: (was: 2.4.3) (was: 2.3.3)

[jira] [Updated] (SPARK-27789) Use stopEarly in codegen of ColumnarBatchScan

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27789: -- Affects Version/s: (was: 2.4.3) (was: 2.4.2)

[jira] [Updated] (SPARK-28239) Allow TCP connections created by shuffle service auto close on YARN NodeManagers

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28239: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Allow TCP connections

[jira] [Updated] (SPARK-24184) Allow escape comma in spark files

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24184: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Allow escape comma in

[jira] [Updated] (SPARK-27506) Function `from_avro` doesn't allow deserialization of data using other compatible schemas

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27506: -- Affects Version/s: (was: 2.4.1) 3.0.0 > Function `from_avro`

[jira] [Updated] (SPARK-27962) Propagate subprocess stdout when subprocess exits with nonzero status in deploy.RRunner

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27962: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Propagate subprocess

[jira] [Updated] (SPARK-24632) Allow 3rd-party libraries to use pyspark.ml abstractions for Java wrappers for persistence

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24632: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Allow 3rd-party libraries

[jira] [Updated] (SPARK-25121) Support multi-part column name for hint resolution

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25121: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Support multi-part column

[jira] [Updated] (SPARK-25634) New Metrics in External Shuffle Service to help identify abusing application

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25634: -- Affects Version/s: (was: 2.4.0) 3.0.0 > New Metrics in External

[jira] [Updated] (SPARK-26912) Allow setting permission for event_log

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26912: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Allow setting permission

[jira] [Updated] (SPARK-27679) Improve queries with LIKE expression

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27679: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Improve queries with LIKE

[jira] [Updated] (SPARK-24528) Missing optimization for Aggregations/Windowing on a bucketed table

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24528: -- Affects Version/s: (was: 2.4.0) (was: 2.3.0)

[jira] [Updated] (SPARK-27790) Support SQL INTERVAL types

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27790: -- Affects Version/s: (was: 2.4.3) 3.0.0 > Support SQL INTERVAL types

[jira] [Updated] (SPARK-25894) Include a count of the number of physical columns read for a columnar data source in the metadata of FileSourceScanExec

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25894: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Include a count of the

[jira] [Updated] (SPARK-28070) writeType and writeObject in SparkR should be handled by S3 methods

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-28070: -- Affects Version/s: (was: 2.4.3) 3.0.0 > writeType and writeObject

[jira] [Updated] (SPARK-23678) a more efficient partition strategy

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-23678: -- Affects Version/s: (was: 2.4.0) 3.0.0 > a more efficient partition

[jira] [Updated] (SPARK-24914) totalSize is not a good estimate for broadcast joins

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24914: -- Affects Version/s: (was: 2.4.0) 3.0.0 > totalSize is not a good

[jira] [Updated] (SPARK-25927) Fix number of partitions returned by outputPartitioning

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25927: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Fix number of partitions

[jira] [Updated] (SPARK-26594) DataSourceOptions.asMap should return CaseInsensitiveMap

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26594: -- Affects Version/s: (was: 2.4.0) 3.0.0 > DataSourceOptions.asMap

[jira] [Updated] (SPARK-26373) Spark UI 'environment' tab - column to indicate default vs overridden values

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26373: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Spark UI 'environment' tab

[jira] [Updated] (SPARK-26268) Decouple shuffle data from Spark deployment

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26268: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Decouple shuffle data from

[jira] [Updated] (SPARK-27670) Add High available for Spark Hive thrift server.

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27670: -- Affects Version/s: (was: 2.4.0) (was: 2.3.0)

[jira] [Updated] (SPARK-27799) Allow SerializerManager.canUseKryo whitelist to be extended via a configuration

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-27799: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Allow

[jira] [Updated] (SPARK-25555) Generic constructs for windowing and support for custom windows

2019-07-16 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25555: -- Affects Version/s: (was: 2.4.0) 3.0.0 > Generic constructs for
