[jira] [Assigned] (SPARK-27384) File source V2: Prune unnecessary partition columns

2019-04-08 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-27384: --- Assignee: Gengliang Wang > File source V2: Prune unnecessary partition columns > --

[jira] [Resolved] (SPARK-27384) File source V2: Prune unnecessary partition columns

2019-04-08 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-27384. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24296 [https://gith

[jira] [Commented] (SPARK-27289) spark-submit explicit configuration does not take effect but Spark UI shows it's effective

2019-04-08 Thread Udbhav Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812242#comment-16812242 ] Udbhav Agrawal commented on SPARK-27289: yes intermediate data is written in the

[jira] [Commented] (SPARK-27406) UnsafeArrayData serialization breaks when two machines have different Oops size

2019-04-08 Thread Sandeep Katta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812302#comment-16812302 ] Sandeep Katta commented on SPARK-27406: --- [~pengbo] thanks for raising this issue,

[jira] [Commented] (SPARK-27406) UnsafeArrayData serialization breaks when two machines have different Oops size

2019-04-08 Thread peng bo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812376#comment-16812376 ] peng bo commented on SPARK-27406: - [~sandeep.katta2007] Actually, I have already submi

[jira] [Commented] (SPARK-27348) HeartbeatReceiver doesn't remove lost executors from CoarseGrainedSchedulerBackend

2019-04-08 Thread Sandeep Katta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812396#comment-16812396 ] Sandeep Katta commented on SPARK-27348: --- [~zsxwing] do you have any test code or s

[jira] [Created] (SPARK-27407) File source V2: Invalidate cache data on overwrite/append

2019-04-08 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-27407: -- Summary: File source V2: Invalidate cache data on overwrite/append Key: SPARK-27407 URL: https://issues.apache.org/jira/browse/SPARK-27407 Project: Spark

[jira] [Resolved] (SPARK-25407) Spark throws a `ParquetDecodingException` when attempting to read a field from a complex type in certain cases of schema merging

2019-04-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-25407. -- Resolution: Fixed Assignee: Dongjoon Hyun Fix Version/s: 3.0.0 Fixed in https:

[jira] [Commented] (SPARK-16548) java.io.CharConversionException: Invalid UTF-32 character prevents me from querying my data

2019-04-08 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812469#comment-16812469 ] Wenchen Fan commented on SPARK-16548: - Do you have a small dataset to reproduce it?

[jira] [Commented] (SPARK-27364) User-facing APIs for GPU-aware scheduling

2019-04-08 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812488#comment-16812488 ] Thomas Graves commented on SPARK-27364: --- There are 3 main user facing impacts for

[jira] [Comment Edited] (SPARK-27364) User-facing APIs for GPU-aware scheduling

2019-04-08 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812488#comment-16812488 ] Thomas Graves edited comment on SPARK-27364 at 4/8/19 3:01 PM: ---

[jira] [Comment Edited] (SPARK-27364) User-facing APIs for GPU-aware scheduling

2019-04-08 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812488#comment-16812488 ] Thomas Graves edited comment on SPARK-27364 at 4/8/19 3:10 PM: ---

[jira] [Created] (SPARK-27408) functions.coalesce working on csv but not on Mongospark

2019-04-08 Thread yashwanth (JIRA)
yashwanth created SPARK-27408: - Summary: functions.coalesce working on csv but not on Mongospark Key: SPARK-27408 URL: https://issues.apache.org/jira/browse/SPARK-27408 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-27176) Upgrade hadoop-3's built-in Hive maven dependencies to 2.3.4

2019-04-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-27176. - Resolution: Fixed Assignee: Yuming Wang Fix Version/s: 3.0.0 > Upgrade hadoop-3's built-

[jira] [Resolved] (SPARK-13704) TaskSchedulerImpl.createTaskSetManager can be expensive, and result in lost executors due to blocked heartbeats

2019-04-08 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid resolved SPARK-13704. -- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24245 [https://gi

[jira] [Assigned] (SPARK-13704) TaskSchedulerImpl.createTaskSetManager can be expensive, and result in lost executors due to blocked heartbeats

2019-04-08 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid reassigned SPARK-13704: Assignee: Lantao Jin > TaskSchedulerImpl.createTaskSetManager can be expensive, and resul

[jira] [Updated] (SPARK-23710) Upgrade the built-in Hive to 2.3.4 for hadoop-3.2

2019-04-08 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-23710: Target Version/s: 3.0.0 > Upgrade the built-in Hive to 2.3.4 for hadoop-3.2 >

[jira] [Commented] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812628#comment-16812628 ] shane knapp commented on SPARK-27389: - JDKs haven't changed on the jenkins workers i

[jira] [Commented] (SPARK-27348) HeartbeatReceiver doesn't remove lost executors from CoarseGrainedSchedulerBackend

2019-04-08 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812657#comment-16812657 ] Shixiong Zhu commented on SPARK-27348: -- [~sandeep.katta2007] I cannot reproduce thi

[jira] [Created] (SPARK-27409) Micro-batch support for Kafka Source in Spark 2.3

2019-04-08 Thread Prabhjot Singh Bharaj (JIRA)
Prabhjot Singh Bharaj created SPARK-27409: - Summary: Micro-batch support for Kafka Source in Spark 2.3 Key: SPARK-27409 URL: https://issues.apache.org/jira/browse/SPARK-27409 Project: Spark

[jira] [Commented] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812704#comment-16812704 ] shane knapp commented on SPARK-27389: - is this even really a valid timezone? plus,

[jira] [Comment Edited] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812704#comment-16812704 ] shane knapp edited comment on SPARK-27389 at 4/8/19 6:56 PM: -

[jira] [Assigned] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shane knapp reassigned SPARK-27389: --- Assignee: (was: shane knapp) > pyspark test failures w/ "UnknownTimeZoneError: 'US/Paci

[jira] [Commented] (SPARK-25079) [PYTHON] upgrade python 3.4 -> 3.6

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812756#comment-16812756 ] shane knapp commented on SPARK-25079: - waiting on [~bryanc] to release pyarrow 0.12.

[jira] [Commented] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812765#comment-16812765 ] Sean Owen commented on SPARK-27389: --- On the question of what the heck it is, comically

[jira] [Commented] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812767#comment-16812767 ] Sean Owen commented on SPARK-27389: --- What about updating tzdata? > pyspark test failu

[jira] [Commented] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812784#comment-16812784 ] shane knapp commented on SPARK-27389: - well, this started happening ~6am PST on apri

[jira] [Commented] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812785#comment-16812785 ] shane knapp commented on SPARK-27389: - [~srowen] sure, i can update the tzdata packa

[jira] [Commented] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812789#comment-16812789 ] shane knapp commented on SPARK-27389: - updating tzdata didn't do anything noticeable

[jira] [Comment Edited] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812789#comment-16812789 ] shane knapp edited comment on SPARK-27389 at 4/8/19 9:03 PM: -

[jira] [Comment Edited] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812789#comment-16812789 ] shane knapp edited comment on SPARK-27389 at 4/8/19 9:09 PM: -

[jira] [Commented] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812802#comment-16812802 ] Sean Owen commented on SPARK-27389: --- I wonder what has created /usr/share/zoneinfo/US/

[jira] [Commented] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812838#comment-16812838 ] shane knapp commented on SPARK-27389: - well, according to [~bryanc]: """ From the

[jira] [Comment Edited] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812841#comment-16812841 ] shane knapp edited comment on SPARK-27389 at 4/8/19 10:06 PM:

[jira] [Commented] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812841#comment-16812841 ] shane knapp commented on SPARK-27389: - also, java8 appears to believe i'm in the US/

[jira] [Comment Edited] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812841#comment-16812841 ] shane knapp edited comment on SPARK-27389 at 4/8/19 10:07 PM:

[jira] [Updated] (SPARK-16548) java.io.CharConversionException: Invalid UTF-32 character prevents me from querying my data

2019-04-08 Thread Bijith Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bijith Kumar updated SPARK-16548: - Attachment: corrupted.json > java.io.CharConversionException: Invalid UTF-32 character prevents

[jira] [Commented] (SPARK-16548) java.io.CharConversionException: Invalid UTF-32 character prevents me from querying my data

2019-04-08 Thread Bijith Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812858#comment-16812858 ] Bijith Kumar commented on SPARK-16548: -- [~cloud_fan], I couldn't find the specific 

[jira] [Created] (SPARK-27410) Remove deprecated/no-op mllib.Kmeans get/setRuns methods

2019-04-08 Thread Sean Owen (JIRA)
Sean Owen created SPARK-27410: - Summary: Remove deprecated/no-op mllib.Kmeans get/setRuns methods Key: SPARK-27410 URL: https://issues.apache.org/jira/browse/SPARK-27410 Project: Spark Issue Type

[jira] [Commented] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812877#comment-16812877 ] Bryan Cutler commented on SPARK-27389: -- [~shaneknapp], I had a couple of successful

[jira] [Assigned] (SPARK-25407) Spark throws a `ParquetDecodingException` when attempting to read a field from a complex type in certain cases of schema merging

2019-04-08 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reassigned SPARK-25407: - Assignee: Michael Allman (was: Dongjoon Hyun) > Spark throws a `ParquetDecodingExcepti

[jira] [Commented] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812883#comment-16812883 ] shane knapp commented on SPARK-27389: - no, it appears to be random. [https://amplab

[jira] [Comment Edited] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812883#comment-16812883 ] shane knapp edited comment on SPARK-27389 at 4/9/19 12:05 AM:

[jira] [Comment Edited] (SPARK-27389) pyspark test failures w/ "UnknownTimeZoneError: 'US/Pacific-New'"

2019-04-08 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812883#comment-16812883 ] shane knapp edited comment on SPARK-27389 at 4/9/19 12:21 AM:

[jira] [Assigned] (SPARK-26881) Scaling issue with Gramian computation for RowMatrix: too many results sent to driver

2019-04-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-26881: - Assignee: Rafael RENAUDIN-AVINO > Scaling issue with Gramian computation for RowMatrix: too man

[jira] [Resolved] (SPARK-26881) Scaling issue with Gramian computation for RowMatrix: too many results sent to driver

2019-04-08 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26881. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23983 [https://github.c

[jira] [Commented] (SPARK-27409) Micro-batch support for Kafka Source in Spark 2.3

2019-04-08 Thread Shivu Sondur (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812991#comment-16812991 ] Shivu Sondur commented on SPARK-27409: -- i am checking this > Micro-batch support f

[jira] [Assigned] (SPARK-27328) Create 'deprecate' property in ExpressionDescription for SQL functions documentation

2019-04-08 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-27328: --- Assignee: Hyukjin Kwon > Create 'deprecate' property in ExpressionDescription for SQL funct

[jira] [Resolved] (SPARK-27328) Create 'deprecate' property in ExpressionDescription for SQL functions documentation

2019-04-08 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-27328. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 24259 [https://gith

[jira] [Created] (SPARK-27411) DataSourceV2Strategy should not eliminate subquery

2019-04-08 Thread Mingcong Han (JIRA)
Mingcong Han created SPARK-27411: Summary: DataSourceV2Strategy should not eliminate subquery Key: SPARK-27411 URL: https://issues.apache.org/jira/browse/SPARK-27411 Project: Spark Issue Type

[jira] [Created] (SPARK-27412) Add a new shuffle manager to use Persistent Memory as shuffle and spilling storage

2019-04-08 Thread Chendi.Xue (JIRA)
Chendi.Xue created SPARK-27412: -- Summary: Add a new shuffle manager to use Persistent Memory as shuffle and spilling storage Key: SPARK-27412 URL: https://issues.apache.org/jira/browse/SPARK-27412 Projec

[jira] [Updated] (SPARK-27412) Add a new shuffle manager to use Persistent Memory as shuffle and spilling storage

2019-04-08 Thread Chendi.Xue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chendi.Xue updated SPARK-27412: --- External issue URL: https://github.com/apache/spark/pull/24322 > Add a new shuffle manager to use Pe

[jira] [Updated] (SPARK-27412) Add a new shuffle manager to use Persistent Memory as shuffle and spilling storage

2019-04-08 Thread Chendi.Xue (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chendi.Xue updated SPARK-27412: --- External issue URL: (was: https://github.com/apache/spark/pull/24322) > Add a new shuffle manager