[jira] [Updated] (SPARK-25427) Add BloomFilter creation test cases

2018-09-13 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25427: -- Component/s: Tests > Add BloomFilter creation test cases >

[jira] [Resolved] (SPARK-25418) The metadata of DataSource table should not include Hive-generated storage properties.

2018-09-13 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-25418. - Resolution: Fixed Fix Version/s: 3.0.0 > The metadata of DataSource table should not include

[jira] [Assigned] (SPARK-25418) The metadata of DataSource table should not include Hive-generated storage properties.

2018-09-13 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-25418: --- Assignee: Takuya Ueshin > The metadata of DataSource table should not include Hive-generated

[jira] [Created] (SPARK-25427) Add BloomFilter creation test cases

2018-09-13 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-25427: - Summary: Add BloomFilter creation test cases Key: SPARK-25427 URL: https://issues.apache.org/jira/browse/SPARK-25427 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-24498) Add JDK compiler for runtime codegen

2018-09-13 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-24498: Target Version/s: 3.0.0 > Add JDK compiler for runtime codegen > > >

[jira] [Commented] (SPARK-23906) Add UDF trunc(numeric)

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614307#comment-16614307 ] Yuming Wang commented on SPARK-23906: - cc [~dongjoon] It's difficult to reuse {{trunc}} for

[jira] [Created] (SPARK-25426) Handles subexpression elimination config inside CodeGeneratorWithInterpretedFallback

2018-09-13 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-25426: Summary: Handles subexpression elimination config inside CodeGeneratorWithInterpretedFallback Key: SPARK-25426 URL: https://issues.apache.org/jira/browse/SPARK-25426

[jira] [Updated] (SPARK-25414) make it clear that the numRows metrics should be counted for each scan of the source

2018-09-13 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-25414: Summary: make it clear that the numRows metrics should be counted for each scan of the source

[jira] [Updated] (SPARK-25414) make it clear that the numRows metrics should be counted for each scan of the source

2018-09-13 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-25414: Issue Type: Test (was: Bug) > make it clear that the numRows metrics should be counted for each

[jira] [Comment Edited] (SPARK-25293) Dataframe write to csv saves part files in outputDireotry/task-xx/part-xxx instead of directly saving in outputDir

2018-09-13 Thread omkar puttagunta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614245#comment-16614245 ] omkar puttagunta edited comment on SPARK-25293 at 9/14/18 2:03 AM: ---

[jira] [Commented] (SPARK-25293) Dataframe write to csv saves part files in outputDireotry/task-xx/part-xxx instead of directly saving in outputDir

2018-09-13 Thread omkar puttagunta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614245#comment-16614245 ] omkar puttagunta commented on SPARK-25293: -- [~hyukjin.kwon] tested with 2.1.3, got the same

[jira] [Updated] (SPARK-25293) Dataframe write to csv saves part files in outputDireotry/task-xx/part-xxx instead of directly saving in outputDir

2018-09-13 Thread omkar puttagunta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] omkar puttagunta updated SPARK-25293: - Affects Version/s: 2.1.3 > Dataframe write to csv saves part files in

[jira] [Comment Edited] (SPARK-23200) Reset configuration when restarting from checkpoints

2018-09-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614208#comment-16614208 ] Stavros Kontopoulos edited comment on SPARK-23200 at 9/14/18 1:07 AM:

[jira] [Comment Edited] (SPARK-23200) Reset configuration when restarting from checkpoints

2018-09-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614208#comment-16614208 ] Stavros Kontopoulos edited comment on SPARK-23200 at 9/14/18 1:04 AM:

[jira] [Comment Edited] (SPARK-23200) Reset configuration when restarting from checkpoints

2018-09-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614208#comment-16614208 ] Stavros Kontopoulos edited comment on SPARK-23200 at 9/14/18 1:03 AM:

[jira] [Comment Edited] (SPARK-23200) Reset configuration when restarting from checkpoints

2018-09-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614208#comment-16614208 ] Stavros Kontopoulos edited comment on SPARK-23200 at 9/14/18 1:03 AM:

[jira] [Commented] (SPARK-23200) Reset configuration when restarting from checkpoints

2018-09-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614208#comment-16614208 ] Stavros Kontopoulos commented on SPARK-23200: - [~cloud_fan] I think this should go in 2.4

[jira] [Updated] (SPARK-23200) Reset configuration when restarting from checkpoints

2018-09-13 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-23200: Issue Type: Bug (was: Improvement) > Reset configuration when restarting from checkpoints >

[jira] [Commented] (SPARK-25378) ArrayData.toArray(StringType) assume UTF8String in 2.4

2018-09-13 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614200#comment-16614200 ] Liang-Chi Hsieh commented on SPARK-25378: - The fix looks like:

[jira] [Comment Edited] (SPARK-23200) Reset configuration when restarting from checkpoints

2018-09-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614196#comment-16614196 ] Stavros Kontopoulos edited comment on SPARK-23200 at 9/14/18 12:45 AM:

[jira] [Comment Edited] (SPARK-23200) Reset configuration when restarting from checkpoints

2018-09-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614196#comment-16614196 ] Stavros Kontopoulos edited comment on SPARK-23200 at 9/14/18 12:44 AM:

[jira] [Commented] (SPARK-23200) Reset configuration when restarting from checkpoints

2018-09-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614196#comment-16614196 ] Stavros Kontopoulos commented on SPARK-23200: - This is important and should have been a bug

[jira] [Commented] (SPARK-25344) Break large tests.py files into smaller files

2018-09-13 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614176#comment-16614176 ] Bryan Cutler commented on SPARK-25344: -- From the mailing list, I think we should agree on a few

[jira] [Commented] (SPARK-25053) Allow additional port forwarding on Spark on K8S as needed

2018-09-13 Thread Stavros Kontopoulos (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614111#comment-16614111 ] Stavros Kontopoulos commented on SPARK-25053: - This is going to be covered by the pod

[jira] [Updated] (SPARK-25423) Output "dataFilters" in DataSourceScanExec.metadata

2018-09-13 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-25423: Labels: starter (was: ) > Output "dataFilters" in DataSourceScanExec.metadata >

[jira] [Commented] (SPARK-25164) Parquet reader builds entire list of columns once for each column

2018-09-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614020#comment-16614020 ] Ruslan Dautkhanov commented on SPARK-25164: --- Hi [~bersprockets] Thanks a lot for the

[jira] [Comment Edited] (SPARK-25164) Parquet reader builds entire list of columns once for each column

2018-09-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614020#comment-16614020 ] Ruslan Dautkhanov edited comment on SPARK-25164 at 9/13/18 8:19 PM:

[jira] [Created] (SPARK-25425) Extra options must overwrite sessions options

2018-09-13 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-25425: -- Summary: Extra options must overwrite sessions options Key: SPARK-25425 URL: https://issues.apache.org/jira/browse/SPARK-25425 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-21291) R bucketBy partitionBy API

2018-09-13 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613994#comment-16613994 ] Felix Cheung commented on SPARK-21291: -- No, you wouldn’t return a writer in R. I will reply with

[jira] [Assigned] (SPARK-25400) Increase timeouts in schedulerIntegrationSuite

2018-09-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-25400: - Assignee: Imran Rashid > Increase timeouts in schedulerIntegrationSuite >

[jira] [Resolved] (SPARK-25400) Increase timeouts in schedulerIntegrationSuite

2018-09-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25400. --- Resolution: Fixed Fix Version/s: 2.3.2 2.4.0 Issue resolved by pull

[jira] [Resolved] (SPARK-25338) Several tests miss calling super.afterAll() in their afterAll() method

2018-09-13 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-25338. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 22337

[jira] [Assigned] (SPARK-25338) Several tests miss calling super.afterAll() in their afterAll() method

2018-09-13 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-25338: - Assignee: Kazuaki Ishizaki > Several tests miss calling super.afterAll() in their

[jira] [Updated] (SPARK-25424) Window duration and slide duration with negative values should fail fast

2018-09-13 Thread Raghav Kumar Gautam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raghav Kumar Gautam updated SPARK-25424: Fix Version/s: (was: 2.3.2) 2.4.0 > Window duration and

[jira] [Updated] (SPARK-25424) Window duration and slide duration with negative values should fail fast

2018-09-13 Thread Raghav Kumar Gautam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raghav Kumar Gautam updated SPARK-25424: Description: In TimeWindow class window duration and slide duration should not be

[jira] [Commented] (SPARK-21291) R bucketBy partitionBy API

2018-09-13 Thread Huaxin Gao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613832#comment-16613832 ] Huaxin Gao commented on SPARK-21291: [~felixcheung] I am working on this, but not sure if my

[jira] [Commented] (SPARK-25424) Window duration and slide duration with negative values should fail fast

2018-09-13 Thread Raghav Kumar Gautam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613830#comment-16613830 ] Raghav Kumar Gautam commented on SPARK-25424: - I have a patch for this. Can someone assign

[jira] [Updated] (SPARK-25424) Window duration and slide duration with negative values should fail fast

2018-09-13 Thread Raghav Kumar Gautam (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Raghav Kumar Gautam updated SPARK-25424: Target Version/s: 2.4.0 (was: 2.3.2) > Window duration and slide duration with

[jira] [Created] (SPARK-25424) Window duration and slide duration with negative values should fail fast

2018-09-13 Thread Raghav Kumar Gautam (JIRA)
Raghav Kumar Gautam created SPARK-25424: --- Summary: Window duration and slide duration with negative values should fail fast Key: SPARK-25424 URL: https://issues.apache.org/jira/browse/SPARK-25424

[jira] [Updated] (SPARK-25423) Output "dataFilters" in DataSourceScanExec.metadata

2018-09-13 Thread Maryann Xue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maryann Xue updated SPARK-25423: Summary: Output "dataFilters" in DataSourceScanExec.metadata (was: Output "dataFilters" in

[jira] [Resolved] (SPARK-25406) Incorrect usage of withSQLConf method in Parquet schema pruning test suite masks failing tests

2018-09-13 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai resolved SPARK-25406. - Resolution: Fixed Fix Version/s: 2.4.0 3.0.0 Issue resolved by pull request

[jira] [Assigned] (SPARK-25406) Incorrect usage of withSQLConf method in Parquet schema pruning test suite masks failing tests

2018-09-13 Thread DB Tsai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DB Tsai reassigned SPARK-25406: --- Assignee: Michael Allman > Incorrect usage of withSQLConf method in Parquet schema pruning test

[jira] [Created] (SPARK-25423) Output "dataFilters" in DataSourceScanExec.toString

2018-09-13 Thread Maryann Xue (JIRA)
Maryann Xue created SPARK-25423: --- Summary: Output "dataFilters" in DataSourceScanExec.toString Key: SPARK-25423 URL: https://issues.apache.org/jira/browse/SPARK-25423 Project: Spark Issue

[jira] [Assigned] (SPARK-25170) Add Task Metrics description to the documentation

2018-09-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-25170: - Assignee: Luca Canali > Add Task Metrics description to the documentation >

[jira] [Resolved] (SPARK-25170) Add Task Metrics description to the documentation

2018-09-13 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25170. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 22397

[jira] [Updated] (SPARK-25407) Spark throws a `ParquetDecodingException` when attempting to read a field from a complex type in certain cases of schema merging

2018-09-13 Thread Michael Allman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Allman updated SPARK-25407: --- Description: Spark supports merging schemata across table partitions in which one partition

[jira] [Commented] (SPARK-25422) flaky test: org.apache.spark.DistributedSuite.caching on disk, replicated (encryption = on) (with replication as stream)

2018-09-13 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613575#comment-16613575 ] Wenchen Fan commented on SPARK-25422: - cc [~squito] is it related to the 2GB limitation change? >

[jira] [Created] (SPARK-25422) flaky test: org.apache.spark.DistributedSuite.caching on disk, replicated (encryption = on) (with replication as stream)

2018-09-13 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-25422: --- Summary: flaky test: org.apache.spark.DistributedSuite.caching on disk, replicated (encryption = on) (with replication as stream) Key: SPARK-25422 URL:

[jira] [Updated] (SPARK-25402) Null handling in BooleanSimplification

2018-09-13 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25402: -- Fix Version/s: 2.2.3 > Null handling in BooleanSimplification >

[jira] [Updated] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread huanghuai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huanghuai updated SPARK-25420: -- Priority: Trivial (was: Major) > Dataset.count() every time is different. >

[jira] [Updated] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread huanghuai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huanghuai updated SPARK-25420: -- Priority: Major (was: Trivial) > Dataset.count() every time is different. >

[jira] [Updated] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread huanghuai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] huanghuai updated SPARK-25420: -- Issue Type: Question (was: Bug) > Dataset.count() every time is different. >

[jira] [Commented] (SPARK-25404) Staging path may not on the expected place when table path contains the stagingDir string

2018-09-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613369#comment-16613369 ] Apache Spark commented on SPARK-25404: -- User 'fjh100456' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25404) Staging path may not on the expected place when table path contains the stagingDir string

2018-09-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25404: Assignee: (was: Apache Spark) > Staging path may not on the expected place when

[jira] [Assigned] (SPARK-25404) Staging path may not on the expected place when table path contains the stagingDir string

2018-09-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25404: Assignee: Apache Spark > Staging path may not on the expected place when table path

[jira] [Commented] (SPARK-25404) Staging path may not on the expected place when table path contains the stagingDir string

2018-09-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613368#comment-16613368 ] Apache Spark commented on SPARK-25404: -- User 'fjh100456' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25421) Abstract an output path field in trait DataWritingCommand

2018-09-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25421: Assignee: Apache Spark > Abstract an output path field in trait DataWritingCommand >

[jira] [Commented] (SPARK-25421) Abstract an output path field in trait DataWritingCommand

2018-09-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613327#comment-16613327 ] Apache Spark commented on SPARK-25421: -- User 'LantaoJin' has created a pull request for this issue:

[jira] [Commented] (SPARK-25421) Abstract an output path field in trait DataWritingCommand

2018-09-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613328#comment-16613328 ] Apache Spark commented on SPARK-25421: -- User 'LantaoJin' has created a pull request for this issue:

[jira] [Assigned] (SPARK-25421) Abstract an output path field in trait DataWritingCommand

2018-09-13 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-25421: Assignee: (was: Apache Spark) > Abstract an output path field in trait

[jira] [Created] (SPARK-25421) Abstract an output path field in trait DataWritingCommand

2018-09-13 Thread Lantao Jin (JIRA)
Lantao Jin created SPARK-25421: -- Summary: Abstract an output path field in trait DataWritingCommand Key: SPARK-25421 URL: https://issues.apache.org/jira/browse/SPARK-25421 Project: Spark Issue

[jira] [Commented] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613217#comment-16613217 ] Marco Gaido commented on SPARK-25420: - I think the reason here is that since we don't enforce any

[jira] [Comment Edited] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613209#comment-16613209 ] Marco Gaido edited comment on SPARK-25420 at 9/13/18 8:51 AM: -- Please do

[jira] [Updated] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25420: Labels: SQL (was: SQL correctness) > Dataset.count() every time is different. >

[jira] [Commented] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613209#comment-16613209 ] Marco Gaido commented on SPARK-25420: - Please do not use Critical/Blocker as they are reserved for

[jira] [Updated] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25420: Labels: SQL correctness (was: ) > Dataset.count() every time is different. >

[jira] [Updated] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25420: Priority: Major (was: Critical) > Dataset.count() every time is different. >

[jira] [Created] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread huanghuai (JIRA)
huanghuai created SPARK-25420: - Summary: Dataset.count() every time is different. Key: SPARK-25420 URL: https://issues.apache.org/jira/browse/SPARK-25420 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-25378) ArrayData.toArray(StringType) assume UTF8String in 2.4

2018-09-13 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613157#comment-16613157 ] Liang-Chi Hsieh edited comment on SPARK-25378 at 9/13/18 8:33 AM: -- I

[jira] [Commented] (SPARK-25412) FeatureHasher would change the value of output feature

2018-09-13 Thread Vincent (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613181#comment-16613181 ] Vincent commented on SPARK-25412: - Thanks, Nick, for the reply. So, the tradeoff is between highly

[jira] [Reopened] (SPARK-24538) ByteArrayDecimalType support push down to parquet data sources

2018-09-13 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-24538: -- > ByteArrayDecimalType support push down to parquet data sources >

[jira] [Resolved] (SPARK-24538) ByteArrayDecimalType support push down to parquet data sources

2018-09-13 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24538. -- Resolution: Duplicate > ByteArrayDecimalType support push down to parquet data sources >

[jira] [Resolved] (SPARK-24549) Support DecimalType push down to the parquet data sources

2018-09-13 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24549. -- Resolution: Fixed Fixed in https://github.com/apache/spark/pull/21556 > Support DecimalType

[jira] [Reopened] (SPARK-24549) Support DecimalType push down to the parquet data sources

2018-09-13 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-24549: -- > Support DecimalType push down to the parquet data sources >

[jira] [Updated] (SPARK-24538) ByteArrayDecimalType support push down to parquet data sources

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24538: Issue Type: Sub-task (was: Improvement) Parent: SPARK-25419 > ByteArrayDecimalType

[jira] [Commented] (SPARK-24538) ByteArrayDecimalType support push down to parquet data sources

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613170#comment-16613170 ] Yuming Wang commented on SPARK-24538: - [~cloud_fan] Could you please update this ticket to 

[jira] [Updated] (SPARK-24538) ByteArrayDecimalType support push down to parquet data sources

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24538: Issue Type: Improvement (was: Sub-task) Parent: (was: SPARK-25419) >

[jira] [Comment Edited] (SPARK-25378) ArrayData.toArray(StringType) assume UTF8String in 2.4

2018-09-13 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613157#comment-16613157 ] Liang-Chi Hsieh edited comment on SPARK-25378 at 9/13/18 7:59 AM: -- I

[jira] [Resolved] (SPARK-25412) FeatureHasher would change the value of output feature

2018-09-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-25412. Resolution: Not A Bug > FeatureHasher would change the value of output feature >

[jira] [Commented] (SPARK-25412) FeatureHasher would change the value of output feature

2018-09-13 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613160#comment-16613160 ] Nick Pentreath commented on SPARK-25412: (1) is by design. Feature hashing does not store the

[jira] [Updated] (SPARK-24549) Support DecimalType push down to the parquet data sources

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24549: Fix Version/s: 2.4.0 > Support DecimalType push down to the parquet data sources >

[jira] [Commented] (SPARK-25378) ArrayData.toArray(StringType) assume UTF8String in 2.4

2018-09-13 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613157#comment-16613157 ] Liang-Chi Hsieh commented on SPARK-25378: - I think a quick fix is to use general `get` method

[jira] [Updated] (SPARK-24538) ByteArrayDecimalType support push down to parquet data sources

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24538: Fix Version/s: (was: 2.4.0) > ByteArrayDecimalType support push down to parquet data sources

[jira] [Updated] (SPARK-25207) Case-insensitve field resolution for filter pushdown when reading Parquet

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-25207: Issue Type: Sub-task (was: Bug) Parent: SPARK-25419 > Case-insensitve field resolution

[jira] [Updated] (SPARK-17091) Convert IN predicate to equivalent Parquet filter

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-17091: Affects Version/s: 2.4.0 Component/s: SQL Issue Type: Sub-task (was: Bug)

[jira] [Resolved] (SPARK-25419) Parquet predicate pushdown improvement

2018-09-13 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-25419. - Resolution: Fixed Fix Version/s: 2.4.0 > Parquet predicate pushdown improvement >

[jira] [Updated] (SPARK-24716) Refactor ParquetFilters

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24716: Issue Type: Sub-task (was: Improvement) Parent: SPARK-25419 > Refactor ParquetFilters >

[jira] [Updated] (SPARK-24718) Timestamp support pushdown to parquet data source

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24718: Issue Type: Sub-task (was: Improvement) Parent: SPARK-25419 > Timestamp support pushdown

[jira] [Updated] (SPARK-24549) Support DecimalType push down to the parquet data sources

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24549: Issue Type: Sub-task (was: Improvement) Parent: SPARK-25419 > Support DecimalType push

[jira] [Updated] (SPARK-24638) StringStartsWith support push down

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24638: Issue Type: Sub-task (was: Improvement) Parent: SPARK-25419 > StringStartsWith support

[jira] [Updated] (SPARK-24706) Support ByteType and ShortType pushdown to parquet

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24706: Issue Type: Sub-task (was: Improvement) Parent: SPARK-25419 > Support ByteType and

[jira] [Updated] (SPARK-23727) Support DATE predict push down in parquet

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-23727: Issue Type: Sub-task (was: Improvement) Parent: SPARK-25419 > Support DATE predict push

[jira] [Updated] (SPARK-24538) ByteArrayDecimalType support push down to parquet data sources

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-24538: Issue Type: Sub-task (was: Improvement) Parent: SPARK-25419 > ByteArrayDecimalType

[jira] [Created] (SPARK-25419) Parquet predicate pushdown improvement

2018-09-13 Thread Yuming Wang (JIRA)
Yuming Wang created SPARK-25419: --- Summary: Parquet predicate pushdown improvement Key: SPARK-25419 URL: https://issues.apache.org/jira/browse/SPARK-25419 Project: Spark Issue Type: Umbrella

[jira] [Commented] (SPARK-24538) ByteArrayDecimalType support push down to parquet data sources

2018-09-13 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613135#comment-16613135 ] Yuming Wang commented on SPARK-24538: - [~cloud_fan] OK, I will do it. > ByteArrayDecimalType

[jira] [Commented] (SPARK-20937) Describe spark.sql.parquet.writeLegacyFormat property in Spark SQL, DataFrames and Datasets Guide

2018-09-13 Thread Sergei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613113#comment-16613113 ] Sergei commented on SPARK-20937: Do you remember what you finally did with it? > Describe

[jira] [Commented] (SPARK-24538) ByteArrayDecimalType support push down to parquet data sources

2018-09-13 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613105#comment-16613105 ] Wenchen Fan commented on SPARK-24538: - [~yumwang] can you create an umbrella JIRA ticket for all