[jira] [Commented] (SPARK-25474) Size in bytes of the query is coming in EB in case of parquet datasource

2018-09-19 Thread sandeep katta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621541#comment-16621541 ] sandeep katta commented on SPARK-25474: --- okay I would like to check this issue and raise PR >

[jira] [Commented] (SPARK-25474) Size in bytes of the query is coming in EB in case of parquet datasource

2018-09-19 Thread Ayush Anubhava (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621539#comment-16621539 ] Ayush Anubhava commented on SPARK-25474: It behaves the same even if fall back to hdfs is

[jira] [Commented] (SPARK-25475) Refactor all benchmark to save the result as a separate file

2018-09-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621538#comment-16621538 ] Wenchen Fan commented on SPARK-25475: - I'd suggest one benchmark one PR. > Refactor all benchmark

[jira] [Commented] (SPARK-25475) Refactor all benchmark to save the result as a separate file

2018-09-19 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621536#comment-16621536 ] Yuming Wang commented on SPARK-25475: - [~dongjoon] How should we create pull requests? One benchmark

[jira] [Updated] (SPARK-25475) Refactor all benchmark to save the result as a separate file

2018-09-19 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25475: -- Description: This is an umbrella issue to refactor all benchmarks to use a common style

[jira] [Updated] (SPARK-25475) Refactor all benchmark to save the result as a separate file

2018-09-19 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25475: -- Description: This is an umbrella issue to refactor all benchmarks to use a common style

[jira] [Commented] (SPARK-25474) Size in bytes of the query is coming in EB in case of parquet datasource

2018-09-19 Thread sandeep katta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621535#comment-16621535 ] sandeep katta commented on SPARK-25474: --- what is the behavior for

[jira] [Created] (SPARK-25475) Refactor all benchmark to save the result as a separate file

2018-09-19 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-25475: - Summary: Refactor all benchmark to save the result as a separate file Key: SPARK-25475 URL: https://issues.apache.org/jira/browse/SPARK-25475 Project: Spark

[jira] [Updated] (SPARK-25339) Refactor FilterPushdownBenchmark to use main method

2018-09-19 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25339: -- Issue Type: Sub-task (was: Test) Parent: SPARK-25475 > Refactor

[jira] [Commented] (SPARK-10816) EventTime based sessionization

2018-09-19 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621512#comment-16621512 ] Jungtaek Lim commented on SPARK-10816: -- WIP version of patch:

[jira] [Updated] (SPARK-10816) EventTime based sessionization

2018-09-19 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-10816: - Attachment: SPARK-10816 Support session window natively.pdf > EventTime based sessionization >

[jira] [Assigned] (SPARK-25339) Refactor FilterPushdownBenchmark to use main method

2018-09-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-25339: --- Assignee: Yuming Wang > Refactor FilterPushdownBenchmark to use main method >

[jira] [Updated] (SPARK-10816) EventTime based sessionization

2018-09-19 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-10816: - Attachment: (was: SPARK-10816_ Support session window natively.pdf) > EventTime based

[jira] [Resolved] (SPARK-25339) Refactor FilterPushdownBenchmark to use main method

2018-09-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25339. - Resolution: Fixed Fix Version/s: 2.5.0 Issue resolved by pull request 22443

[jira] [Created] (SPARK-25474) Size in bytes of the query is coming in EB in case of parquet datasource

2018-09-19 Thread Ayush Anubhava (JIRA)
Ayush Anubhava created SPARK-25474: -- Summary: Size in bytes of the query is coming in EB in case of parquet datasource Key: SPARK-25474 URL: https://issues.apache.org/jira/browse/SPARK-25474

[jira] [Commented] (SPARK-10816) EventTime based sessionization

2018-09-19 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621497#comment-16621497 ] Jungtaek Lim commented on SPARK-10816: -- Just attached the doc. I also have shareable version of doc

[jira] [Updated] (SPARK-10816) EventTime based sessionization

2018-09-19 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim updated SPARK-10816: - Attachment: SPARK-10816_ Support session window natively.pdf > EventTime based sessionization >

[jira] [Resolved] (SPARK-23648) extend hint syntax to support any expression for R

2018-09-19 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-23648. -- Resolution: Fixed Assignee: Huaxin Gao Fix Version/s: 2.5.0 > extend hint

[jira] [Commented] (SPARK-10816) EventTime based sessionization

2018-09-19 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621483#comment-16621483 ] Jungtaek Lim commented on SPARK-10816: -- I would like to propose native support on session window,

[jira] [Resolved] (SPARK-25470) Eval method of Concat expression should call pattern matching only once

2018-09-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-25470. -- Resolution: Duplicate > Eval method of Concat expression should call pattern matching only

[jira] [Commented] (SPARK-25469) Eval method of Concat expression should call pattern matching only once

2018-09-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621462#comment-16621462 ] Hyukjin Kwon commented on SPARK-25469: -- Please fill the JIRA description > Eval method of Concat

[jira] [Commented] (SPARK-23715) from_utc_timestamp returns incorrect results for some UTC date/time values

2018-09-19 Thread Bruce Robbins (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621452#comment-16621452 ] Bruce Robbins commented on SPARK-23715: --- Hi [~rxin], Thanks for following up with me. This is a

[jira] [Updated] (SPARK-25473) PySpark ForeachWriter test fails on Python 3.6 and macOS High Serria

2018-09-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-25473: - Summary: PySpark ForeachWriter test fails on Python 3.6 and macOS High Serria (was: PySpark

[jira] [Created] (SPARK-25473) PySpark ForeachWriter test fails on Python 3.6 and mocOS High Serria

2018-09-19 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-25473: Summary: PySpark ForeachWriter test fails on Python 3.6 and mocOS High Serria Key: SPARK-25473 URL: https://issues.apache.org/jira/browse/SPARK-25473 Project: Spark

[jira] [Resolved] (SPARK-25457) IntegralDivide (div) should not always return long

2018-09-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25457. - Resolution: Fixed Fix Version/s: 2.5.0 Issue resolved by pull request 22465

[jira] [Assigned] (SPARK-25457) IntegralDivide (div) should not always return long

2018-09-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-25457: --- Assignee: Marco Gaido > IntegralDivide (div) should not always return long >

[jira] [Commented] (SPARK-25432) Consider if using standard getOrCreate from PySpark into JVM SparkSession would simplify code

2018-09-19 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621416#comment-16621416 ] Kazuaki Ishizaki commented on SPARK-25432: -- nit: description seems to be in {{environment}}

[jira] [Issue Comment Deleted] (SPARK-25437) Using OpenHashMap replace HashMap improve Encoder Performance

2018-09-19 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki updated SPARK-25437: - Comment: was deleted (was: Is such a feature for major release, not for maintenance

[jira] [Commented] (SPARK-24523) InterruptedException when closing SparkContext

2018-09-19 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621410#comment-16621410 ] Imran Rashid commented on SPARK-24523: -- When you say the SparkContext is automatically stopped, you

[jira] [Commented] (SPARK-23715) from_utc_timestamp returns incorrect results for some UTC date/time values

2018-09-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621405#comment-16621405 ] Reynold Xin commented on SPARK-23715: - Also I reject the notion that the old behavior was

[jira] [Commented] (SPARK-23715) from_utc_timestamp returns incorrect results for some UTC date/time values

2018-09-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621395#comment-16621395 ] Reynold Xin commented on SPARK-23715: - [~bersprockets] i think we should revert the change while we

[jira] [Commented] (SPARK-20598) Iterative checkpoints do not get removed from HDFS

2018-09-19 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621396#comment-16621396 ] holdenk commented on SPARK-20598: - Huh that's interesting. I suspect that could be we're keeping the

[jira] [Commented] (SPARK-25467) Python date/datetime objects in dataframes increment by 1 day when converted to JSON

2018-09-19 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621391#comment-16621391 ] holdenk commented on SPARK-25467: - cc [~bryanc] > Python date/datetime objects in dataframes increment

[jira] [Assigned] (SPARK-14352) approxQuantile should support multi columns

2018-09-19 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-14352: --- Assignee: zhengruifeng > approxQuantile should support multi columns >

[jira] [Commented] (SPARK-17602) PySpark - Performance Optimization Large Size of Broadcast Variable

2018-09-19 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621389#comment-16621389 ] holdenk commented on SPARK-17602: - Did we end up going anywhere with this? > PySpark - Performance

[jira] [Resolved] (SPARK-25471) Fix tests for Python 3.6 with Pandas 0.23+

2018-09-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-25471. -- Resolution: Fixed Fix Version/s: 2.4.0 3.0.0

[jira] [Assigned] (SPARK-25471) Fix tests for Python 3.6 with Pandas 0.23+

2018-09-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-25471: Assignee: Bryan Cutler > Fix tests for Python 3.6 with Pandas 0.23+ >

[jira] [Resolved] (SPARK-14352) approxQuantile should support multi columns

2018-09-19 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-14352. - Resolution: Fixed Target Version/s: 2.2.0 > approxQuantile should support multi columns >

[jira] [Assigned] (SPARK-25472) Structured Streaming query.stop() doesn't always stop gracefully

2018-09-19 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz reassigned SPARK-25472: --- Assignee: Burak Yavuz > Structured Streaming query.stop() doesn't always stop gracefully >

[jira] [Updated] (SPARK-25425) Extra options must overwrite sessions options

2018-09-19 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25425: -- Fix Version/s: (was: 2.5.0) 2.4.0 > Extra options must overwrite

[jira] [Commented] (SPARK-24309) AsyncEventQueue should handle an interrupt from a Listener

2018-09-19 Thread Umayr Hassan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621317#comment-16621317 ] Umayr Hassan commented on SPARK-24309: -- Hi folks. I'm not sure this - or a similar - issue is

[jira] [Updated] (SPARK-25021) Add spark.executor.pyspark.memory support to Kubernetes

2018-09-19 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk updated SPARK-25021: Fix Version/s: 2.4.0 > Add spark.executor.pyspark.memory support to Kubernetes >

[jira] [Updated] (SPARK-24523) InterruptedException when closing SparkContext

2018-09-19 Thread Umayr Hassan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umayr Hassan updated SPARK-24523: - Affects Version/s: 2.3.1 > InterruptedException when closing SparkContext >

[jira] [Updated] (SPARK-24523) InterruptedException when closing SparkContext

2018-09-19 Thread Umayr Hassan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umayr Hassan updated SPARK-24523: - Environment: EMR 5.14.0, S3/HDFS inputs and outputs; EMR 5.17 was: EMR 5.14.0,

[jira] [Commented] (SPARK-24523) InterruptedException when closing SparkContext

2018-09-19 Thread Umayr Hassan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621281#comment-16621281 ] Umayr Hassan commented on SPARK-24523: -- [~dongjoon] We still see this in Spark 2.3.1 (EMR 5.17). As

[jira] [Updated] (SPARK-24360) Support Hive 3.1 metastore

2018-09-19 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-24360: -- Target Version/s: 2.5.0 > Support Hive 3.1 metastore > -- > >

[jira] [Created] (SPARK-25472) Structured Streaming query.stop() doesn't always stop gracefully

2018-09-19 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-25472: --- Summary: Structured Streaming query.stop() doesn't always stop gracefully Key: SPARK-25472 URL: https://issues.apache.org/jira/browse/SPARK-25472 Project: Spark

[jira] [Assigned] (SPARK-25471) Fix tests for Python 3.6 with Pandas 0.23+

2018-09-19 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bryan Cutler reassigned SPARK-25471: Assignee: (was: Bryan Cutler) > Fix tests for Python 3.6 with Pandas 0.23+ >

[jira] [Created] (SPARK-25471) Fix tests for Python 3.6 with Pandas 0.23+

2018-09-19 Thread Bryan Cutler (JIRA)
Bryan Cutler created SPARK-25471: Summary: Fix tests for Python 3.6 with Pandas 0.23+ Key: SPARK-25471 URL: https://issues.apache.org/jira/browse/SPARK-25471 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-24360) Support Hive 3.1 metastore

2018-09-19 Thread t oo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621218#comment-16621218 ] t oo commented on SPARK-24360: -- bump > Support Hive 3.1 metastore > -- > >

[jira] [Commented] (SPARK-25164) Parquet reader builds entire list of columns once for each column

2018-09-19 Thread Bruce Robbins (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16621125#comment-16621125 ] Bruce Robbins commented on SPARK-25164: --- {quote}I am thinking if it's feasible to lazily realize

[jira] [Updated] (SPARK-25449) Don't send zero accumulators in heartbeats

2018-09-19 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-25449: - Issue Type: Improvement (was: Task) > Don't send zero accumulators in heartbeats >

[jira] [Updated] (SPARK-25449) Don't send zero accumulators in heartbeats

2018-09-19 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-25449: - Target Version/s: (was: 2.5.0) > Don't send zero accumulators in heartbeats >

[jira] [Resolved] (SPARK-15420) Repartition and sort before Parquet writes

2018-09-19 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue resolved SPARK-15420. --- Resolution: Won't Fix > Repartition and sort before Parquet writes >

[jira] [Updated] (SPARK-15420) Repartition and sort before Parquet writes

2018-09-19 Thread Ryan Blue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ryan Blue updated SPARK-15420: -- Target Version/s: (was: 2.4.0) > Repartition and sort before Parquet writes >

[jira] [Created] (SPARK-25469) Eval method of Concat expression should call pattern matching only once

2018-09-19 Thread Marek Novotny (JIRA)
Marek Novotny created SPARK-25469: - Summary: Eval method of Concat expression should call pattern matching only once Key: SPARK-25469 URL: https://issues.apache.org/jira/browse/SPARK-25469 Project:

[jira] [Created] (SPARK-25470) Eval method of Concat expression should call pattern matching only once

2018-09-19 Thread Marek Novotny (JIRA)
Marek Novotny created SPARK-25470: - Summary: Eval method of Concat expression should call pattern matching only once Key: SPARK-25470 URL: https://issues.apache.org/jira/browse/SPARK-25470 Project:

[jira] [Resolved] (SPARK-25414) make it clear that the numRows metrics should be counted for each scan of the source

2018-09-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25414. - Resolution: Fixed Fix Version/s: 2.5.0 > make it clear that the numRows metrics should

[jira] [Updated] (SPARK-25468) Highlight current page index in the history server

2018-09-19 Thread Dhruve Ashar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dhruve Ashar updated SPARK-25468: - Attachment: SparkHistoryServer.png > Highlight current page index in the history server >

[jira] [Updated] (SPARK-25468) Highlight current page index in the history server

2018-09-19 Thread Dhruve Ashar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dhruve Ashar updated SPARK-25468: - Description: Spark History Server Web UI should highlight the current page index selected for

[jira] [Updated] (SPARK-25468) Highlight current page index in the history server

2018-09-19 Thread Dhruve Ashar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dhruve Ashar updated SPARK-25468: - Description: Spark History Server Web UI should highlight the current page index selected for

[jira] [Created] (SPARK-25468) Highlight current page index in the history server

2018-09-19 Thread Dhruve Ashar (JIRA)
Dhruve Ashar created SPARK-25468: Summary: Highlight current page index in the history server Key: SPARK-25468 URL: https://issues.apache.org/jira/browse/SPARK-25468 Project: Spark Issue

[jira] [Commented] (SPARK-24434) Support user-specified driver and executor pod templates

2018-09-19 Thread Rob Vesse (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620809#comment-16620809 ] Rob Vesse commented on SPARK-24434: --- Started a mailing list thread re: the limitations of this as

[jira] [Updated] (SPARK-25467) Python date/datetime objects in dataframes increment by 1 day when converted to JSON

2018-09-19 Thread David V. Hill (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David V. Hill updated SPARK-25467: -- Description: When Dataframes contains datetime.date or datetime.datetime instances and

[jira] [Comment Edited] (SPARK-20236) Overwrite a partitioned data source table should only overwrite related partitions

2018-09-19 Thread Deepanker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619236#comment-16619236 ] Deepanker edited comment on SPARK-20236 at 9/19/18 3:03 PM: Hi Wenchen Fan,

[jira] [Updated] (SPARK-25467) Python date/datetime objects in dataframes increment by 1 day when converted to JSON

2018-09-19 Thread David V. Hill (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David V. Hill updated SPARK-25467: -- Description: When Dataframes contains datetime.date or datetime.datetime objects and

[jira] [Created] (SPARK-25467) Python date/datetime objects in dataframes increment by 1 day when converted to JSON

2018-09-19 Thread David V. Hill (JIRA)
David V. Hill created SPARK-25467: - Summary: Python date/datetime objects in dataframes increment by 1 day when converted to JSON Key: SPARK-25467 URL: https://issues.apache.org/jira/browse/SPARK-25467

[jira] [Commented] (SPARK-25422) flaky test: org.apache.spark.DistributedSuite.caching on disk, replicated (encryption = on) (with replication as stream)

2018-09-19 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620666#comment-16620666 ] Imran Rashid commented on SPARK-25422: -- I'm still looking at this, but so far haven't figured it

[jira] [Commented] (SPARK-12978) Skip unnecessary final group-by when input data already clustered with group-by keys

2018-09-19 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620605#comment-16620605 ] Takeshi Yamamuro commented on SPARK-12978: -- yea, I think so and we need to discuss more about

[jira] [Commented] (SPARK-25466) Documentation does not specify how to set fafka consumer cache capacity for SS

2018-09-19 Thread Patrick McGloin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620575#comment-16620575 ] Patrick McGloin commented on SPARK-25466: - [~tdas] I am happy to update the doc or adjust the

[jira] [Created] (SPARK-25466) Documentation does not specify how to set fafka consumer cache capacity for SS

2018-09-19 Thread Patrick McGloin (JIRA)
Patrick McGloin created SPARK-25466: --- Summary: Documentation does not specify how to set fafka consumer cache capacity for SS Key: SPARK-25466 URL: https://issues.apache.org/jira/browse/SPARK-25466

[jira] [Created] (SPARK-25465) Refactor Parquet test suites in project Hive

2018-09-19 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-25465: -- Summary: Refactor Parquet test suites in project Hive Key: SPARK-25465 URL: https://issues.apache.org/jira/browse/SPARK-25465 Project: Spark Issue Type:

[jira] [Updated] (SPARK-25464) Dropping database can remove the hive warehouse directory contents

2018-09-19 Thread Sushanta Sen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanta Sen updated SPARK-25464: - Description: Create Database. CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] db_name [COMMENT

[jira] [Resolved] (SPARK-25358) MutableProjection supports fallback to an interpreted mode

2018-09-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25358. - Resolution: Fixed Fix Version/s: 2.5.0 Issue resolved by pull request 22355

[jira] [Assigned] (SPARK-25358) MutableProjection supports fallback to an interpreted mode

2018-09-19 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-25358: --- Assignee: Takeshi Yamamuro > MutableProjection supports fallback to an interpreted mode >

[jira] [Commented] (SPARK-25464) Dropping database can remove the hive warehouse directory contents

2018-09-19 Thread sandeep katta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620489#comment-16620489 ] sandeep katta commented on SPARK-25464: --- I have raised the PR 

[jira] [Commented] (SPARK-24777) Add write benchmark for AVRO

2018-09-19 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620488#comment-16620488 ] Gengliang Wang commented on SPARK-24777: PR for this task:

[jira] [Updated] (SPARK-25464) Dropping database can remove the hive warehouse directory contents

2018-09-19 Thread Sushanta Sen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sushanta Sen updated SPARK-25464: - Description: Create Database. CREATE (DATABASE|SCHEMA) [IF NOT EXISTS] db_name [COMMENT

[jira] [Created] (SPARK-25464) Dropping database can remove the hive warehouse directory contents

2018-09-19 Thread Sushanta Sen (JIRA)
Sushanta Sen created SPARK-25464: Summary: Dropping database can remove the hive warehouse directory contents Key: SPARK-25464 URL: https://issues.apache.org/jira/browse/SPARK-25464 Project: Spark

[jira] [Commented] (SPARK-20937) Describe spark.sql.parquet.writeLegacyFormat property in Spark SQL, DataFrames and Datasets Guide

2018-09-19 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620476#comment-16620476 ] Yuming Wang commented on SPARK-20937: - User 'seancxmao' has created a pull request for this issue:

[jira] [Commented] (SPARK-20937) Describe spark.sql.parquet.writeLegacyFormat property in Spark SQL, DataFrames and Datasets Guide

2018-09-19 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620468#comment-16620468 ] Takeshi Yamamuro commented on SPARK-20937: -- +1, too. Do you have time to write these docs?

[jira] [Commented] (SPARK-25462) hive on spark - got a weird output when count(*) from this script

2018-09-19 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620460#comment-16620460 ] Takeshi Yamamuro commented on SPARK-25462: -- You should first ask in the spark-user mailing

[jira] [Resolved] (SPARK-25462) hive on spark - got a weird output when count(*) from this script

2018-09-19 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro resolved SPARK-25462. -- Resolution: Invalid > hive on spark - got a weird output when count(*) from this

[jira] [Commented] (SPARK-25331) Structured Streaming File Sink duplicates records in case of driver failure

2018-09-19 Thread Mihaly Toth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620454#comment-16620454 ] Mihaly Toth commented on SPARK-25331: - I have updated the PR with a potential solution and removed

[jira] [Created] (SPARK-25463) Make sure single expression can parse sort order

2018-09-19 Thread Herman van Hovell (JIRA)
Herman van Hovell created SPARK-25463: - Summary: Make sure single expression can parse sort order Key: SPARK-25463 URL: https://issues.apache.org/jira/browse/SPARK-25463 Project: Spark

[jira] [Commented] (SPARK-23899) Built-in SQL Function Improvement

2018-09-19 Thread Georg Heiler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620230#comment-16620230 ] Georg Heiler commented on SPARK-23899: -- What about repartitioning by complex types, i.e. size of

[jira] [Updated] (SPARK-25462) hive on spark - got a weird output when count(*) from this script

2018-09-19 Thread Gu Yuchen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gu Yuchen updated SPARK-25462: -- Environment: spark 1.6.2 hive 1.2.2 hadoop 2.7.1 was: spark 1.6.1 hive 1.2.2 hadoop 2.7.1 >

[jira] [Reopened] (SPARK-25462) hive on spark - got a weird output when count(*) from this script

2018-09-19 Thread Gu Yuchen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gu Yuchen reopened SPARK-25462: --- please help me out with this. thanks a lot  > hive on spark - got a weird output when count(*) from

[jira] [Commented] (SPARK-25454) Division between operands with negative scale can cause precision loss

2018-09-19 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620192#comment-16620192 ] Marco Gaido commented on SPARK-25454: - [~bersprockets] you're right, the only "wrong" thing of your

[jira] [Moved] (SPARK-25462) hive on spark - got a weird output when count(*) from this script

2018-09-19 Thread Gu Yuchen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gu Yuchen moved HIVE-20592 to SPARK-25462: -- Shepherd: Jeremy Affects Version/s: 1.6.2 Component/s:

[jira] [Commented] (SPARK-25452) Query with where clause is giving unexpected result in case of float column

2018-09-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25452?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620156#comment-16620156 ] Hyukjin Kwon commented on SPARK-25452: -- Thanks. I will appreciate if this can be identified as a