[jira] [Created] (SPARK-26866) Support kinesis checkpoint with subSequenceNumber

2019-02-12 Thread TaeYoung Kim (JIRA)
TaeYoung Kim created SPARK-26866: Summary: Support kinesis checkpoint with subSequenceNumber Key: SPARK-26866 URL: https://issues.apache.org/jira/browse/SPARK-26866 Project: Spark Issue

[jira] [Assigned] (SPARK-26866) Support kinesis checkpoint with subSequenceNumber

2019-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26866: Assignee: (was: Apache Spark) > Support kinesis checkpoint with subSequenceNumber >

[jira] [Assigned] (SPARK-26866) Support kinesis checkpoint with subSequenceNumber

2019-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26866: Assignee: Apache Spark > Support kinesis checkpoint with subSequenceNumber >

[jira] [Updated] (SPARK-26866) Support kinesis checkpoint with subSequenceNumber

2019-02-12 Thread TaeYoung Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] TaeYoung Kim updated SPARK-26866: - Description: AWS Kinesis Producer Library

[jira] [Assigned] (SPARK-26865) DataSourceV2Strategy should push normalized filters

2019-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26865: Assignee: (was: Apache Spark) > DataSourceV2Strategy should push normalized filters

[jira] [Assigned] (SPARK-26865) DataSourceV2Strategy should push normalized filters

2019-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26865: Assignee: Apache Spark > DataSourceV2Strategy should push normalized filters >

[jira] [Updated] (SPARK-26865) DataSourceV2Strategy should push normalized filters

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26865: -- Summary: DataSourceV2Strategy should push normalized filters (was: SupportsPushDownFilters

[jira] [Updated] (SPARK-26865) SupportsPushDownFilters should push normalized filters

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26865: -- Summary: SupportsPushDownFilters should push normalized filters (was:

[jira] [Commented] (SPARK-26865) SupportsPushDownFilters should push the same filters with DSv1

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766819#comment-16766819 ] Dongjoon Hyun commented on SPARK-26865: --- I'll make a PR for this. > SupportsPushDownFilters

[jira] [Updated] (SPARK-26865) SupportsPushDownFilters should push the same filters with DSv1

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26865: -- Summary: SupportsPushDownFilters should push the same filters with DSv1 (was: DSv2

[jira] [Updated] (SPARK-26865) DSv2 SupportsPushDownFilters should push the same filters with DSv1

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26865: -- Summary: DSv2 SupportsPushDownFilters should push the same filters with DSv1 (was: ORC

[jira] [Updated] (SPARK-26865) ORC filter pushdown should be case insensitive by default

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26865: -- Description: Although we designed `SupportsPushDownFilters` in the same way by using

[jira] [Commented] (SPARK-26865) ORC filter pushdown should be case insensitive by default

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766802#comment-16766802 ] Dongjoon Hyun commented on SPARK-26865: --- [~cloud_fan]. I updated the issue description. The root

[jira] [Updated] (SPARK-26865) ORC filter pushdown should be case insensitive by default

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26865: -- Description: DSv1 and DSv2 pass different filters. Also, DSv2 doesn't guarantee that filter

[jira] [Commented] (SPARK-26865) ORC filter pushdown should be case insensitive by default

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766787#comment-16766787 ] Dongjoon Hyun commented on SPARK-26865: --- This happens in DSv2 and is caused by SPARK-23817. > ORC

[jira] [Updated] (SPARK-26865) ORC filter pushdown should be case insensitive by default

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26865: -- Affects Version/s: (was: 2.4.0) 3.0.0 > ORC filter pushdown should

[jira] [Commented] (SPARK-26865) ORC filter pushdown should be case insensitive by default

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766785#comment-16766785 ] Dongjoon Hyun commented on SPARK-26865: --- Oh, it's on master branch. I'm investigating it. > ORC

[jira] [Comment Edited] (SPARK-26865) ORC filter pushdown should be case insensitive by default

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766781#comment-16766781 ] Dongjoon Hyun edited comment on SPARK-26865 at 2/13/19 5:11 AM: Hi,

[jira] [Commented] (SPARK-26865) ORC filter pushdown should be case insensitive by default

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766779#comment-16766779 ] Dongjoon Hyun commented on SPARK-26865: --- Thank you for pinging me. I'll take a look. > ORC filter

[jira] [Commented] (SPARK-26865) ORC filter pushdown should be case insensitive by default

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766781#comment-16766781 ] Dongjoon Hyun commented on SPARK-26865: --- Hi, [~cloud_fan]. The following is the result from the

[jira] [Comment Edited] (SPARK-24211) Flaky test: StreamingOuterJoinSuite

2019-02-12 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766756#comment-16766756 ] Takeshi Yamamuro edited comment on SPARK-24211 at 2/13/19 4:10 AM: ---

[jira] [Commented] (SPARK-24239) Flaky test: KafkaContinuousSourceSuite.subscribing topic by name from earliest offsets

2019-02-12 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766757#comment-16766757 ] Takeshi Yamamuro commented on SPARK-24239: -- Based on the discussion

[jira] [Commented] (SPARK-24211) Flaky test: StreamingOuterJoinSuite

2019-02-12 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766756#comment-16766756 ] Takeshi Yamamuro commented on SPARK-24211: -- Based on the discussion

[jira] [Updated] (SPARK-23491) continuous symptom

2019-02-12 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-23491: - Fix Version/s: 2.3.4 > continuous symptom > -- > > Key:

[jira] [Updated] (SPARK-23416) Flaky test: KafkaSourceStressForDontFailOnDataLossSuite.stress test for failOnDataLoss=false

2019-02-12 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-23416: - Fix Version/s: 2.3.4 > Flaky test: KafkaSourceStressForDontFailOnDataLossSuite.stress

[jira] [Updated] (SPARK-24239) Flaky test: KafkaContinuousSourceSuite.subscribing topic by name from earliest offsets

2019-02-12 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro updated SPARK-24239: - Affects Version/s: (was: 2.4.0) 2.3.2 > Flaky test:

[jira] [Resolved] (SPARK-26761) Vectorized gapply, Arrow optimization in native R function execution

2019-02-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26761. -- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23746

[jira] [Commented] (SPARK-26865) ORC filter pushdown should be case insensitive by default

2019-02-12 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766717#comment-16766717 ] Wenchen Fan commented on SPARK-26865: - cc [~dongjoon] [~Gengliang.Wang] [~LI,Xiao] > ORC filter

[jira] [Commented] (SPARK-26777) SQL worked in 2.3.2 and fails in 2.4.0

2019-02-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766698#comment-16766698 ] Hyukjin Kwon commented on SPARK-26777: -- There are already bunches of issues open (2500+). Leaving

[jira] [Created] (SPARK-26865) ORC filter pushdown should be case insensitive by default

2019-02-12 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-26865: --- Summary: ORC filter pushdown should be case insensitive by default Key: SPARK-26865 URL: https://issues.apache.org/jira/browse/SPARK-26865 Project: Spark

[jira] [Resolved] (SPARK-26240) [pyspark] Updating illegal column names with withColumnRenamed does not change schema changes, causing pyspark.sql.utils.AnalysisException

2019-02-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26240. -- Resolution: Incomplete Resolving this due to no feedback from reporter. > [pyspark] Updating

[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0

2019-02-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766710#comment-16766710 ] Hyukjin Kwon commented on SPARK-23534: -- It's blocked by Hive upgrade. We should resolve that first

[jira] [Commented] (SPARK-26860) RangeBetween docs appear to be wrong

2019-02-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766708#comment-16766708 ] Hyukjin Kwon commented on SPARK-26860: -- There's no need to assign someone. Just make a PR and that

[jira] [Commented] (SPARK-26016) Encoding not working when using a map / mapPartitions call

2019-02-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766690#comment-16766690 ] Hyukjin Kwon commented on SPARK-26016: -- Okay, can you make the input / output in JIRA's

[jira] [Resolved] (SPARK-26320) udf with multiple arrays as input

2019-02-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26320. -- Resolution: Incomplete Leaving this resolved due to no feedback. > udf with multiple arrays

[jira] [Commented] (SPARK-26703) Hive record writer will always depends on parquet-1.6 writer should fix it

2019-02-12 Thread zhoukang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766682#comment-16766682 ] zhoukang commented on SPARK-26703: -- As far as I know, [~hyukjin.kwon], Hive still depends on twitter

[jira] [Comment Edited] (SPARK-26777) SQL worked in 2.3.2 and fails in 2.4.0

2019-02-12 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766653#comment-16766653 ] Jungtaek Lim edited comment on SPARK-26777 at 2/13/19 2:13 AM: --- I'd rather

[jira] [Comment Edited] (SPARK-26777) SQL worked in 2.3.2 and fails in 2.4.0

2019-02-12 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766653#comment-16766653 ] Jungtaek Lim edited comment on SPARK-26777 at 2/13/19 2:12 AM: --- I'd rather

[jira] [Comment Edited] (SPARK-26777) SQL worked in 2.3.2 and fails in 2.4.0

2019-02-12 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766653#comment-16766653 ] Jungtaek Lim edited comment on SPARK-26777 at 2/13/19 1:56 AM: --- I'd rather

[jira] [Commented] (SPARK-26777) SQL worked in 2.3.2 and fails in 2.4.0

2019-02-12 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766653#comment-16766653 ] Jungtaek Lim commented on SPARK-26777: -- I'd rather concern about EMR vs vanilla (Apache version),

[jira] [Resolved] (SPARK-26857) Return UnsafeArrayData for date/timestamp type in ColumnarArray.copy()

2019-02-12 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro resolved SPARK-26857. -- Resolution: Fixed Assignee: Gengliang Wang Fix Version/s: 3.0.0

[jira] [Assigned] (SPARK-26864) Query may return incorrect result when python udf is used as a join condition and the udf uses attributes from both legs of left semi join.

2019-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26864: Assignee: (was: Apache Spark) > Query may return incorrect result when python udf is

[jira] [Assigned] (SPARK-26864) Query may return incorrect result when python udf is used as a join condition and the udf uses attributes from both legs of left semi join.

2019-02-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26864: Assignee: Apache Spark > Query may return incorrect result when python udf is used as a

[jira] [Created] (SPARK-26864) Query may return incorrect result when python udf is used as a join condition and the udf uses attributes from both legs of left semi join.

2019-02-12 Thread Dilip Biswal (JIRA)
Dilip Biswal created SPARK-26864: Summary: Query may return incorrect result when python udf is used as a join condition and the udf uses attributes from both legs of left semi join. Key: SPARK-26864 URL:

[jira] [Commented] (SPARK-26777) SQL worked in 2.3.2 and fails in 2.4.0

2019-02-12 Thread Yuri Budilov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766613#comment-16766613 ] Yuri Budilov commented on SPARK-26777: -- I split the above query from 1 into 3 and it worked OK on

[jira] [Resolved] (SPARK-26862) assertion failed in ParquetRowConverter

2019-02-12 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu resolved SPARK-26862. - Resolution: Invalid > assertion failed in ParquetRowConverter > ---

[jira] [Commented] (SPARK-21492) Memory leak in SortMergeJoin

2019-02-12 Thread Tao Luo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766598#comment-16766598 ] Tao Luo commented on SPARK-21492: - cc [~tejasp], [~kiszk] for input on code generation to address the

[jira] [Updated] (SPARK-26851) CachedRDDBuilder only partially implements double-checked locking

2019-02-12 Thread Bruce Robbins (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bruce Robbins updated SPARK-26851: -- Description: In CachedRDDBuilder, {{cachedColumnBuffers}} uses double-checked locking to

[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0

2019-02-12 Thread t oo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766577#comment-16766577 ] t oo commented on SPARK-23534: -- Time for a Hadoop 3.2 profile? I am curious to know if Hadoop 3 offers much

[jira] [Comment Edited] (SPARK-24374) SPIP: Support Barrier Execution Mode in Apache Spark

2019-02-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766550#comment-16766550 ] Xiangrui Meng edited comment on SPARK-24374 at 2/12/19 10:54 PM: -

[jira] [Commented] (SPARK-26395) Spark Thrift server memory leak

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766547#comment-16766547 ] Marcelo Vanzin commented on SPARK-26395: The code that cleans up stages does clean up the RDD

[jira] [Resolved] (SPARK-26588) Idle executor should properly be killed when no job is submitted

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-26588. Resolution: Duplicate > Idle executor should properly be killed when no job is submitted

[jira] [Commented] (SPARK-24374) SPIP: Support Barrier Execution Mode in Apache Spark

2019-02-12 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766550#comment-16766550 ] Xiangrui Meng commented on SPARK-24374: --- [~luzengxiang] When you create external processes, did

[jira] [Updated] (SPARK-26770) Misleading/unhelpful error message when wrapping a null in an Option

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26770: --- Component/s: (was: Spark Core) SQL > Misleading/unhelpful error

[jira] [Resolved] (SPARK-25917) Spark UI's executors page loads forever when memoryMetrics in None. Fix is to JSON ignore memorymetrics when it is None.

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-25917. Resolution: Cannot Reproduce > Spark UI's executors page loads forever when memoryMetrics

[jira] [Updated] (SPARK-26631) Issue while reading Parquet data from Hadoop Archive files (.har)

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26631: --- Component/s: (was: Spark Core) SQL > Issue while reading Parquet data

[jira] [Updated] (SPARK-25987) StackOverflowError when executing many operations on a table with many columns

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-25987: --- Component/s: (was: Spark Core) SQL > StackOverflowError when executing

[jira] [Updated] (SPARK-26150) __spark_conf__XXX.zip doesn't exist

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26150: --- Component/s: (was: Spark Submit) (was: Spark Core)

[jira] [Updated] (SPARK-26240) [pyspark] Updating illegal column names with withColumnRenamed does not change schema changes, causing pyspark.sql.utils.AnalysisException

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26240: --- Component/s: (was: Spark Core) SQL > [pyspark] Updating illegal column

[jira] [Updated] (SPARK-26325) Interpret timestamp fields in Spark while reading json (timestampFormat)

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26325: --- Component/s: (was: Spark Core) SQL > Interpret timestamp fields in

[jira] [Updated] (SPARK-26320) udf with multiple arrays as input

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26320: --- Component/s: (was: Spark Core) SQL > udf with multiple arrays as input

[jira] [Resolved] (SPARK-26279) Remove unused method in Logging

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-26279. Resolution: Won't Fix > Remove unused method in Logging > ---

[jira] [Resolved] (SPARK-26417) Make comments for states available for logging

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-26417. Resolution: Won't Fix If you're talking about the constants in {{SparkAppHandle.State}},

[jira] [Updated] (SPARK-26436) Dataframe resulting from a GroupByKey and flatMapGroups operation throws java.lang.UnsupportedException when groupByKey is applied on it.

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26436: --- Component/s: (was: Spark Core) SQL > Dataframe resulting from a

[jira] [Updated] (SPARK-26509) Parquet DELTA_BYTE_ARRAY is not supported in Spark 2.x's Vectorized Reader

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26509: --- Component/s: (was: Spark Core) > Parquet DELTA_BYTE_ARRAY is not supported in Spark

[jira] [Updated] (SPARK-26560) Repeating select on udf function throws analysis exception - function not registered

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26560: --- Component/s: (was: Spark Core) SQL > Repeating select on udf function

[jira] [Updated] (SPARK-26589) proper `median` method for spark dataframe

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26589: --- Component/s: (was: Spark Core) SQL > proper `median` method for spark

[jira] [Updated] (SPARK-26769) partition prunning in inner join

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26769: --- Component/s: (was: Spark Core) SQL > partition prunning in inner join

[jira] [Updated] (SPARK-26800) JDBC - MySQL nullable option is ignored

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26800: --- Component/s: (was: Spark Core) SQL > JDBC - MySQL nullable option is

[jira] [Commented] (SPARK-26855) SparkSubmitSuite fails on a clean build

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766531#comment-16766531 ] Marcelo Vanzin commented on SPARK-26855: Could R-related tests be moved to the "R" module? I

[jira] [Commented] (SPARK-26777) SQL worked in 2.3.2 and fails in 2.4.0

2019-02-12 Thread t oo (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766526#comment-16766526 ] t oo commented on SPARK-26777: -- [~yuri.budilov] - can you try in spark-shell instead of pyspark? Also with

[jira] [Resolved] (SPARK-1502) Spark on Yarn: add config option to not include yarn/mapred cluster classpath

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-1502. --- Resolution: Won't Fix I don't think a new setting is worth it. You can use the

[jira] [Assigned] (SPARK-1502) Spark on Yarn: add config option to not include yarn/mapred cluster classpath

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-1502: - Assignee: (was: Marcelo Vanzin) > Spark on Yarn: add config option to not include

[jira] [Assigned] (SPARK-1502) Spark on Yarn: add config option to not include yarn/mapred cluster classpath

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-1502: - Assignee: Marcelo Vanzin (was: Thomas Graves) > Spark on Yarn: add config option to

[jira] [Resolved] (SPARK-1537) Add integration with Yarn's Application Timeline Server

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-1537. --- Resolution: Won't Fix I don't think it makes sense to integrate with the ATS at this point.

[jira] [Resolved] (SPARK-19649) Spark YARN client throws exception if job succeeds and max-completed-applications=0

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-19649. Resolution: Won't Fix There isn't a reliable way to do this in Spark without the RM

[jira] [Comment Edited] (SPARK-26859) Reading ORC files with explicit schema can result in wrong data

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766495#comment-16766495 ] Dongjoon Hyun edited comment on SPARK-26859 at 2/12/19 9:43 PM: Thank

[jira] [Resolved] (SPARK-17667) Make locking fine grained in YarnAllocator#enqueueGetLossReasonRequest

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-17667. Resolution: Won't Fix PR was abandoned. Let's close this one. > Make locking fine

[jira] [Commented] (SPARK-26859) Reading ORC files with explicit schema can result in wrong data

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766495#comment-16766495 ] Dongjoon Hyun commented on SPARK-26859: --- Thank you for reporting, [~ivan.vergiliev]. I understand

[jira] [Updated] (SPARK-26859) Reading ORC files with explicit schema can result in wrong data

2019-02-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26859: -- Priority: Major (was: Blocker) > Reading ORC files with explicit schema can result in wrong

[jira] [Resolved] (SPARK-13852) handle the InterruptedException caused by YARN HA switch

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-13852. Resolution: Won't Fix See discussion in PR. > handle the InterruptedException caused by

[jira] [Resolved] (SPARK-15974) Create a socket on YARN AM start-up

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-15974. Resolution: Won't Fix The way this feature is proposed, I don't agree with it. If

[jira] [Resolved] (SPARK-19941) Spark should not schedule tasks on executors on decommissioning YARN nodes

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-19941. Resolution: Duplicate > Spark should not schedule tasks on executors on decommissioning

[jira] [Assigned] (SPARK-17209) Support manual credential updating in the run-time for Spark on YARN

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-17209: -- Assignee: Marcelo Vanzin > Support manual credential updating in the run-time for

[jira] [Assigned] (SPARK-17209) Support manual credential updating in the run-time for Spark on YARN

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-17209: -- Assignee: (was: Marcelo Vanzin) > Support manual credential updating in the

[jira] [Resolved] (SPARK-18898) Exception not failing Scala applications (in yarn)

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-18898. Resolution: Cannot Reproduce I'm very sure this works in cluster mode. In client mode

[jira] [Resolved] (SPARK-15955) Failed Spark application returns with exitcode equals to zero

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-15955. Resolution: Cannot Reproduce I'm pretty sure this has worked reliably for a while. If

[jira] [Resolved] (SPARK-19894) Tasks entirely assigned to one executor on Yarn-cluster mode for default-rack

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-19894. Resolution: Won't Fix See discussion in PR. My reading is that it was decided this was an

[jira] [Resolved] (SPARK-21933) Spark Streaming request more executors than excepted without DynamicAllocation

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-21933. Resolution: Won't Fix See discussion in PR. > Spark Streaming request more executors

[jira] [Resolved] (SPARK-22199) Spark Job on YARN fails with executors "Slave registration failed"

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-22199. Resolution: Cannot Reproduce So much has changed since 1.6 that I'll make a guess that

[jira] [Resolved] (SPARK-22341) [2.3.0] cannot run Spark on Yarn when Yarn impersonation is turned off

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-22341. Resolution: Fixed This was fixed, forgot to close. > [2.3.0] cannot run Spark on Yarn

[jira] [Commented] (SPARK-26101) Spark Pipe() executes the external app by yarn username not the current username

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766475#comment-16766475 ] Marcelo Vanzin commented on SPARK-26101: Spark does tell YARN which user it wants. But if YARN

[jira] [Resolved] (SPARK-22760) where driver is stopping, and some executors lost because of YarnSchedulerBackend.stop, then there is a problem.

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-22760. Resolution: Won't Fix See discussion in PR. It's just a misleading exception. Not worth

[jira] [Updated] (SPARK-23408) Flaky test: StreamingOuterJoinSuite.left outer early state exclusion on right

2019-02-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-23408: -- Fix Version/s: 2.3.4 > Flaky test: StreamingOuterJoinSuite.left outer early state exclusion on right

[jira] [Commented] (SPARK-26101) Spark Pipe() executes the external app by yarn username not the current username

2019-02-12 Thread Maziyar PANAHI (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766471#comment-16766471 ] Maziyar PANAHI commented on SPARK-26101: I have workaround this issue as I stated here:

[jira] [Resolved] (SPARK-23497) Sparklyr Applications doesn't disconnect spark driver in client mode

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-23497. Resolution: Invalid See above. > Sparklyr Applications doesn't disconnect spark driver

[jira] [Resolved] (SPARK-24205) java.util.concurrent.locks.LockSupport.parkNanos

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-24205. Resolution: Invalid There isn't enough information here to see what's going on. If you

[jira] [Resolved] (SPARK-24700) AM shutdown terminates streaming application but marks it as successful

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-24700. Resolution: Cannot Reproduce This needs more details, like logs if available. AFAIK the

[jira] [Commented] (SPARK-26650) Yarn Client throws 'ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration'

2019-02-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766454#comment-16766454 ] Marcelo Vanzin commented on SPARK-26650: So nothing is being *thrown* here. It's just an
