[jira] [Commented] (SPARK-26709) OptimizeMetadataOnlyQuery does not correctly handle the files with zero record

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751975#comment-16751975 ] Apache Spark commented on SPARK-26709: -- User 'gengliangwang' has created a pull request for this

[jira] [Commented] (SPARK-26709) OptimizeMetadataOnlyQuery does not correctly handle the files with zero record

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751974#comment-16751974 ] Apache Spark commented on SPARK-26709: -- User 'gengliangwang' has created a pull request for this

[jira] [Assigned] (SPARK-26712) Disk broken causing YarnShuffleSerivce not available

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26712: Assignee: (was: Apache Spark) > Disk broken causing YarnShuffleSerivce not available

[jira] [Assigned] (SPARK-26712) Disk broken causing YarnShuffleSerivce not available

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26712: Assignee: Apache Spark > Disk broken causing YarnShuffleSerivce not available

[jira] [Commented] (SPARK-26708) Incorrect result caused by inconsistency between a SQL cache's cached RDD and its physical plan

2019-01-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751927#comment-16751927 ] Dongjoon Hyun commented on SPARK-26708: --- Hi, [~smilegator]. Is this only related to Spark 2.4.0?

[jira] [Assigned] (SPARK-26725) Fix the input values of UnifiedMemoryManager constructor in test suites

2019-01-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-26725: - Assignee: Sean Owen > Fix the input values of UnifiedMemoryManager constructor in test suites

[jira] [Assigned] (SPARK-26725) Fix the input values of UnifiedMemoryManager constructor in test suites

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26725: Assignee: Sean Owen (was: Apache Spark) > Fix the input values of UnifiedMemoryManager

[jira] [Assigned] (SPARK-26725) Fix the input values of UnifiedMemoryManager constructor in test suites

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26725: Assignee: Apache Spark (was: Sean Owen) > Fix the input values of UnifiedMemoryManager

[jira] [Updated] (SPARK-26725) Fix the input values of UnifiedMemoryManager constructor in test suites

2019-01-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26725: -- Priority: Minor (was: Major) > Fix the input values of UnifiedMemoryManager constructor in test

[jira] [Commented] (SPARK-26677) Incorrect results of not(eqNullSafe) when data read from Parquet file

2019-01-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751890#comment-16751890 ] Dongjoon Hyun commented on SPARK-26677: --- Thank you, [~anandchinn] and [~hyukjin.kwon]. So,

[jira] [Assigned] (SPARK-26708) Incorrect result caused by inconsistency between a SQL cache's cached RDD and its physical plan

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26708: Assignee: Apache Spark (was: Maryann Xue) > Incorrect result caused by inconsistency

[jira] [Assigned] (SPARK-26708) Incorrect result caused by inconsistency between a SQL cache's cached RDD and its physical plan

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26708: Assignee: Maryann Xue (was: Apache Spark) > Incorrect result caused by inconsistency

[jira] [Updated] (SPARK-25767) Error reported in Spark logs when using the org.apache.spark:spark-sql_2.11:2.3.2 Java library

2019-01-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-25767: -- Fix Version/s: 2.3.3 > Error reported in Spark logs when using the

[jira] [Resolved] (SPARK-26649) Noop Streaming Sink using DSV2

2019-01-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved SPARK-26649. --- Resolution: Fixed Assignee: Gabor Somogyi Fix Version/s: 3.0.0 This is

[jira] [Updated] (SPARK-26680) StackOverflowError if Stream passed to groupBy

2019-01-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-26680: -- Fix Version/s: 2.3.3 > StackOverflowError if Stream passed to groupBy

[jira] [Assigned] (SPARK-25713) Implement copy() for ColumnarArray

2019-01-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-25713: --- Assignee: Artsiom Yudovin > Implement copy() for ColumnarArray

[jira] [Commented] (SPARK-26654) Use Timestamp/DateFormatter in CatalogColumnStat

2019-01-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751855#comment-16751855 ] Wenchen Fan commented on SPARK-26654: - +1. I think we store string format instead of the actual long

[jira] [Commented] (SPARK-26569) Fixed point for batch Operator Optimizations never reached when optimize logicalPlan

2019-01-24 Thread Chen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751801#comment-16751801 ] Chen Fan commented on SPARK-26569: -- I believe PR under

[jira] [Created] (SPARK-26725) Fix the input values of UnifiedMemoryManager constructor in test suites

2019-01-24 Thread Xiao Li (JIRA)
Xiao Li created SPARK-26725: --- Summary: Fix the input values of UnifiedMemoryManager constructor in test suites Key: SPARK-26725 URL: https://issues.apache.org/jira/browse/SPARK-26725 Project: Spark

[jira] [Updated] (SPARK-26725) Fix the input values of UnifiedMemoryManager constructor in test suites

2019-01-24 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-26725: Labels: starter (was: ) > Fix the input values of UnifiedMemoryManager constructor in test suites

[jira] [Commented] (SPARK-26688) Provide configuration of initially blacklisted YARN nodes

2019-01-24 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751830#comment-16751830 ] Imran Rashid commented on SPARK-26688: -- OK that is a reasonable request ... but to play devil's

[jira] [Resolved] (SPARK-26569) Fixed point for batch Operator Optimizations never reached when optimize logicalPlan

2019-01-24 Thread Chen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chen Fan resolved SPARK-26569. -- Resolution: Duplicate > Fixed point for batch Operator Optimizations never reached when optimize

[jira] [Assigned] (SPARK-23674) Add Spark ML Listener for Tracking ML Pipeline Status

2019-01-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-23674: Assignee: Hyukjin Kwon > Add Spark ML Listener for Tracking ML Pipeline Status

[jira] [Resolved] (SPARK-23674) Add Spark ML Listener for Tracking ML Pipeline Status

2019-01-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-23674. -- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23263

[jira] [Commented] (SPARK-24579) SPIP: Standardize Optimized Data Exchange between Spark and DL/AI frameworks

2019-01-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751802#comment-16751802 ] Bryan Cutler commented on SPARK-24579: -- It would be great to start up this discussion again, I saw

[jira] [Commented] (SPARK-26412) Allow Pandas UDF to take an iterator of pd.DataFrames for the entire partition

2019-01-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26412?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751785#comment-16751785 ] Bryan Cutler commented on SPARK-26412: -- [~mengxr] I think Arrow record batches would be a much more

[jira] [Commented] (SPARK-26410) Support per Pandas UDF configuration

2019-01-24 Thread Bryan Cutler (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751771#comment-16751771 ] Bryan Cutler commented on SPARK-26410: -- This could be useful to have, but it does seem a little

[jira] [Resolved] (SPARK-19591) Add sample weights to decision trees

2019-01-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-19591. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 21632

[jira] [Commented] (SPARK-26677) Incorrect results of not(eqNullSafe) when data read from Parquet file

2019-01-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751767#comment-16751767 ] Hyukjin Kwon commented on SPARK-26677: -- Yes, please read the linked PR above. > Incorrect results

[jira] [Resolved] (SPARK-26187) Stream-stream left outer join returns outer nulls for already matched rows

2019-01-24 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jungtaek Lim resolved SPARK-26187. -- Resolution: Duplicate > Stream-stream left outer join returns outer nulls for already matched

[jira] [Commented] (SPARK-26677) Incorrect results of not(eqNullSafe) when data read from Parquet file

2019-01-24 Thread ANAND CHINNAKANNAN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751757#comment-16751757 ] ANAND CHINNAKANNAN commented on SPARK-26677: [~hyukjin.kwon] - Do you know exactly the issue

[jira] [Created] (SPARK-26724) Non negative coefficients for LinearRegression

2019-01-24 Thread Alex Chang (JIRA)
Alex Chang created SPARK-26724: -- Summary: Non negative coefficients for LinearRegression Key: SPARK-26724 URL: https://issues.apache.org/jira/browse/SPARK-26724 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-13587) Support virtualenv in PySpark

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13587: Assignee: Apache Spark > Support virtualenv in PySpark

[jira] [Assigned] (SPARK-13587) Support virtualenv in PySpark

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13587: Assignee: (was: Apache Spark) > Support virtualenv in PySpark

[jira] [Assigned] (SPARK-13587) Support virtualenv in PySpark

2019-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-13587: -- Assignee: (was: Marcelo Vanzin) > Support virtualenv in PySpark

[jira] [Assigned] (SPARK-13587) Support virtualenv in PySpark

2019-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-13587: -- Assignee: Marcelo Vanzin (was: Jeff Zhang) > Support virtualenv in PySpark

[jira] [Assigned] (SPARK-26697) ShuffleBlockFetcherIterator can log block sizes in addition to num blocks

2019-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-26697: -- Assignee: Imran Rashid > ShuffleBlockFetcherIterator can log block sizes in addition

[jira] [Updated] (SPARK-26723) Spark web UI only shows parts of SQL query graphs for queries with persist operations

2019-01-24 Thread Vladimir Matveev (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Matveev updated SPARK-26723: - Attachment: Screen Shot 2019-01-24 at 4.13.14 PM.png Screen Shot

[jira] [Resolved] (SPARK-26697) ShuffleBlockFetcherIterator can log block sizes in addition to num blocks

2019-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-26697. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23621

[jira] [Created] (SPARK-26723) Spark web UI only shows parts of SQL query graphs for queries with persist operations

2019-01-24 Thread Vladimir Matveev (JIRA)
Vladimir Matveev created SPARK-26723: Summary: Spark web UI only shows parts of SQL query graphs for queries with persist operations Key: SPARK-26723 URL: https://issues.apache.org/jira/browse/SPARK-26723

[jira] [Commented] (SPARK-26187) Stream-stream left outer join returns outer nulls for already matched rows

2019-01-24 Thread Pavel Chernikov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751708#comment-16751708 ] Pavel Chernikov commented on SPARK-26187: - [~kabhwan], I'm definitely okay with that. Feel free

[jira] [Commented] (SPARK-26718) structured streaming fetched wrong current offset from kafka

2019-01-24 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751706#comment-16751706 ] Gabor Somogyi commented on SPARK-26718: --- +1 on [~kabhwan] suggestion > structured streaming

[jira] [Assigned] (SPARK-26530) Validate heartheat arguments in HeartbeatReceiver

2019-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-26530: -- Assignee: liupengcheng > Validate heartheat arguments in HeartbeatReceiver

[jira] [Updated] (SPARK-26721) Bug in feature importance calculation in GBM (and possibly other decision tree classifiers)

2019-01-24 Thread Daniel Jumper (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Jumper updated SPARK-26721: -- Priority: Blocker (was: Critical) > Bug in feature importance calculation in GBM (and

[jira] [Updated] (SPARK-26721) Bug in feature importance calculation in GBM (and possibly other decision tree classifiers)

2019-01-24 Thread Daniel Jumper (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Jumper updated SPARK-26721: -- Description: The feature importance calculation in

[jira] [Commented] (SPARK-26718) structured streaming fetched wrong current offset from kafka

2019-01-24 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751694#comment-16751694 ] Jungtaek Lim commented on SPARK-26718: -- [~linehrr] Thanks for the analysis. I think allowing

[jira] [Updated] (SPARK-26721) Bug in feature importance calculation in GBM (and possibly other decision tree classifiers)

2019-01-24 Thread Daniel Jumper (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Jumper updated SPARK-26721: -- Description: The feature importance calculation in

[jira] [Created] (SPARK-26722) add SPARK_TEST_KEY=1 to pull request builder and spark-master-test-sbt-hadoop-2.7

2019-01-24 Thread shane knapp (JIRA)
shane knapp created SPARK-26722: --- Summary: add SPARK_TEST_KEY=1 to pull request builder and spark-master-test-sbt-hadoop-2.7 Key: SPARK-26722 URL: https://issues.apache.org/jira/browse/SPARK-26722

[jira] [Commented] (SPARK-26721) Bug in feature importance calculation in GBM (and possibly other decision tree classifiers)

2019-01-24 Thread Daniel Jumper (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751691#comment-16751691 ] Daniel Jumper commented on SPARK-26721: --- updated the priority to Blocker as this is a correctness

[jira] [Resolved] (SPARK-26530) Validate heartheat arguments in HeartbeatReceiver

2019-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-26530. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23445

[jira] [Created] (SPARK-26721) Bug in feature importance calculation in GBM (and possibly other decision tree classifiers)

2019-01-24 Thread Daniel Jumper (JIRA)
Daniel Jumper created SPARK-26721: - Summary: Bug in feature importance calculation in GBM (and possibly other decision tree classifiers) Key: SPARK-26721 URL: https://issues.apache.org/jira/browse/SPARK-26721

[jira] [Assigned] (SPARK-26720) Remove unused methods from DateTimeUtils

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26720: Assignee: (was: Apache Spark) > Remove unused methods from DateTimeUtils

[jira] [Assigned] (SPARK-26720) Remove unused methods from DateTimeUtils

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26720: Assignee: Apache Spark > Remove unused methods from DateTimeUtils

[jira] [Commented] (SPARK-26187) Stream-stream left outer join returns outer nulls for already matched rows

2019-01-24 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751672#comment-16751672 ] Jungtaek Lim commented on SPARK-26187: -- Thanks [~ChernikovP], your example helped much to track

[jira] [Commented] (SPARK-26154) Stream-stream joins - left outer join gives inconsistent output

2019-01-24 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751670#comment-16751670 ] Jungtaek Lim commented on SPARK-26154: -- As issue reporter concerns about handling duplicated issue,

[jira] [Created] (SPARK-26720) Remove unused methods from DateTimeUtils

2019-01-24 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-26720: -- Summary: Remove unused methods from DateTimeUtils Key: SPARK-26720 URL: https://issues.apache.org/jira/browse/SPARK-26720 Project: Spark Issue Type: Improvement

[jira] [Assigned] (SPARK-26154) Stream-stream joins - left outer join gives inconsistent output

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26154: Assignee: Apache Spark > Stream-stream joins - left outer join gives inconsistent output

[jira] [Commented] (SPARK-26154) Stream-stream joins - left outer join gives inconsistent output

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751665#comment-16751665 ] Apache Spark commented on SPARK-26154: -- User 'HeartSaVioR' has created a pull request for this

[jira] [Assigned] (SPARK-26154) Stream-stream joins - left outer join gives inconsistent output

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26154: Assignee: (was: Apache Spark) > Stream-stream joins - left outer join gives

[jira] [Commented] (SPARK-26711) JSON Schema inference takes 15 times longer

2019-01-24 Thread Bruce Robbins (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751655#comment-16751655 ] Bruce Robbins commented on SPARK-26711: --- Re: 7 minutes vs. 50 seconds: Looking at the code, it

[jira] [Commented] (SPARK-26682) Task attempt ID collision causes lost data

2019-01-24 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751651#comment-16751651 ] Shixiong Zhu commented on SPARK-26682: -- For future reference, data loss could happen when one task

[jira] [Updated] (SPARK-26682) Task attempt ID collision causes lost data

2019-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26682: --- Fix Version/s: 2.3.3 > Task attempt ID collision causes lost data

[jira] [Commented] (SPARK-26654) Use Timestamp/DateFormatter in CatalogColumnStat

2019-01-24 Thread Maxim Gekk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751634#comment-16751634 ] Maxim Gekk commented on SPARK-26654: [~cloud_fan][~hvanhovell][~srowen] I do believe saving

[jira] [Commented] (SPARK-26718) structured streaming fetched wrong current offset from kafka

2019-01-24 Thread Ryne Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751629#comment-16751629 ] Ryne Yang commented on SPARK-26718: --- A simple fix would be a if statement to check if the integer

[jira] [Commented] (SPARK-26718) structured streaming fetched wrong current offset from kafka

2019-01-24 Thread Ryne Yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751625#comment-16751625 ] Ryne Yang commented on SPARK-26718: --- found the issue, it's the rateLimit calculation. if anyone set

[jira] [Commented] (SPARK-25767) Error reported in Spark logs when using the org.apache.spark:spark-sql_2.11:2.3.2 Java library

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751605#comment-16751605 ] Apache Spark commented on SPARK-25767: -- User 'bersprockets' has created a pull request for this

[jira] [Commented] (SPARK-26608) Remove Jenkins jobs for `branch-2.2`

2019-01-24 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751597#comment-16751597 ] Dongjoon Hyun commented on SPARK-26608: --- Thank you, [~shaneknapp]! :D > Remove Jenkins jobs for

[jira] [Resolved] (SPARK-26608) Remove Jenkins jobs for `branch-2.2`

2019-01-24 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] shane knapp resolved SPARK-26608. - Resolution: Fixed > Remove Jenkins jobs for `branch-2.2`

[jira] [Commented] (SPARK-26608) Remove Jenkins jobs for `branch-2.2`

2019-01-24 Thread shane knapp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751588#comment-16751588 ] shane knapp commented on SPARK-26608: - alright, config changes are merged... i'm going to delete

[jira] [Assigned] (SPARK-26719) Get rid of java.util.Calendar in DateTimeUtils

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26719: Assignee: Apache Spark > Get rid of java.util.Calendar in DateTimeUtils

[jira] [Assigned] (SPARK-26719) Get rid of java.util.Calendar in DateTimeUtils

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26719: Assignee: (was: Apache Spark) > Get rid of java.util.Calendar in DateTimeUtils

[jira] [Created] (SPARK-26719) Get rid of java.util.Calendar in DateTimeUtils

2019-01-24 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-26719: -- Summary: Get rid of java.util.Calendar in DateTimeUtils Key: SPARK-26719 URL: https://issues.apache.org/jira/browse/SPARK-26719 Project: Spark Issue Type:

[jira] [Created] (SPARK-26718) structured streaming fetched wrong current offset from kafka

2019-01-24 Thread Ryne Yang (JIRA)
Ryne Yang created SPARK-26718: - Summary: structured streaming fetched wrong current offset from kafka Key: SPARK-26718 URL: https://issues.apache.org/jira/browse/SPARK-26718 Project: Spark

[jira] [Commented] (SPARK-25590) kubernetes-model-2.0.0.jar masks default Spark logging config

2019-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751425#comment-16751425 ] Marcelo Vanzin commented on SPARK-25590: I filed

[jira] [Commented] (SPARK-18484) case class datasets - ability to specify decimal precision and scale

2019-01-24 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751422#comment-16751422 ] Marco Gaido commented on SPARK-18484: - [~bonazzaf] please do not delete comments, as they may be

[jira] [Resolved] (SPARK-26687) Building Spark Images has non-intuitive behaviour with paths to custom Dockerfiles

2019-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-26687. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23613

[jira] [Assigned] (SPARK-26687) Building Spark Images has non-intuitive behaviour with paths to custom Dockerfiles

2019-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-26687: -- Assignee: Rob Vesse > Building Spark Images has non-intuitive behaviour with paths

[jira] [Resolved] (SPARK-26717) Support PodPriority for spark driver and executor on kubernetes

2019-01-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-26717. Resolution: Not A Problem Pretty sure this is covered by pod templates (SPARK-24434).

[jira] [Created] (SPARK-26717) Support PodPriority for spark driver and executor on kubernetes

2019-01-24 Thread Li Gao (JIRA)
Li Gao created SPARK-26717: -- Summary: Support PodPriority for spark driver and executor on kubernetes Key: SPARK-26717 URL: https://issues.apache.org/jira/browse/SPARK-26717 Project: Spark Issue

[jira] [Assigned] (SPARK-26690) Checkpoints of Dataframes are not visible in the SQL UI

2019-01-24 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell reassigned SPARK-26690: - Assignee: Tom van Bussel > Checkpoints of Dataframes are not visible in the

[jira] [Resolved] (SPARK-26690) Checkpoints of Dataframes are not visible in the SQL UI

2019-01-24 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-26690. --- Resolution: Fixed Fix Version/s: 3.0.0 > Checkpoints of Dataframes are not

[jira] [Commented] (SPARK-18484) case class datasets - ability to specify decimal precision and scale

2019-01-24 Thread Franco Bonazza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751174#comment-16751174 ] Franco Bonazza commented on SPARK-18484: What if you have a DataFrame with higher precision e.g.

[jira] [Assigned] (SPARK-26716) Refactor supportDataType API: the supported types of read/write should be consistent

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26716: Assignee: (was: Apache Spark) > Refactor supportDataType API: the supported types

[jira] [Commented] (SPARK-26699) Dataset column output discrepancies

2019-01-24 Thread Praveena (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751159#comment-16751159 ] Praveena commented on SPARK-26699: -- I am trying to understand why its behaving differently on Local and

[jira] [Assigned] (SPARK-26716) Refactor supportDataType API: the supported types of read/write should be consistent

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26716: Assignee: Apache Spark > Refactor supportDataType API: the supported types of

[jira] [Created] (SPARK-26716) Refactor supportDataType API: the supported types of read/write should be consistent

2019-01-24 Thread Gengliang Wang (JIRA)
Gengliang Wang created SPARK-26716: -- Summary: Refactor supportDataType API: the supported types of read/write should be consistent Key: SPARK-26716 URL: https://issues.apache.org/jira/browse/SPARK-26716

[jira] [Assigned] (SPARK-26713) PipedRDD may holds stdin writer and stdout read threads even if the task is finished

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26713: Assignee: Apache Spark > PipedRDD may holds stdin writer and stdout read threads even if

[jira] [Assigned] (SPARK-26713) PipedRDD may holds stdin writer and stdout read threads even if the task is finished

2019-01-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26713: Assignee: (was: Apache Spark) > PipedRDD may holds stdin writer and stdout read

[jira] [Commented] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2019-01-24 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751082#comment-16751082 ] Gabor Somogyi commented on SPARK-26389: --- I've lowered the prio and will file a PR for this soon.

[jira] [Resolved] (SPARK-26715) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown

2019-01-24 Thread JIRA
[ https://issues.apache.org/jira/browse/SPARK-26715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Antal Bálint Steinbach resolved SPARK-26715. Resolution: Invalid YARN issue. > If linux container executor is not set

[jira] [Updated] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown

2019-01-24 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated SPARK-26389: -- Priority: Minor (was: Major) > temp checkpoint folder at executor should be deleted on

[jira] [Created] (SPARK-26715) If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown

2019-01-24 Thread JIRA
Antal Bálint Steinbach created SPARK-26715: -- Summary: If linux container executor is not set for a GPU cluster GpuResourceHandlerImpl is not initialized and NPE is thrown Key: SPARK-26715 URL:

[jira] [Comment Edited] (SPARK-17333) Make pyspark interface friendly with static analysis

2019-01-24 Thread Maciej Szymkiewicz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751040#comment-16751040 ] Maciej Szymkiewicz edited comment on SPARK-17333 at 1/24/19 11:58 AM:

[jira] [Commented] (SPARK-17333) Make pyspark interface friendly with static analysis

2019-01-24 Thread Maciej Szymkiewicz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751040#comment-16751040 ] Maciej Szymkiewicz commented on SPARK-17333: [~Alexander_Gorokhov] Personally I maintain

[jira] [Comment Edited] (SPARK-17333) Make pyspark interface friendly with static analysis

2019-01-24 Thread Maciej Szymkiewicz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751040#comment-16751040 ] Maciej Szymkiewicz edited comment on SPARK-17333 at 1/24/19 11:54 AM:

[jira] [Comment Edited] (SPARK-17333) Make pyspark interface friendly with static analysis

2019-01-24 Thread Maciej Szymkiewicz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751040#comment-16751040 ] Maciej Szymkiewicz edited comment on SPARK-17333 at 1/24/19 11:53 AM:

[jira] [Comment Edited] (SPARK-17333) Make pyspark interface friendly with static analysis

2019-01-24 Thread Maciej Szymkiewicz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751040#comment-16751040 ] Maciej Szymkiewicz edited comment on SPARK-17333 at 1/24/19 11:53 AM:

[jira] [Commented] (SPARK-25713) Implement copy() for ColumnarArray

2019-01-24 Thread Artsiom Yudovin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751028#comment-16751028 ] Artsiom Yudovin commented on SPARK-25713: - [~cloud_fan]  > Implement copy() for ColumnarArray >

[jira] [Updated] (SPARK-26710) ImageSchemaSuite has some errors when running it in local laptop

2019-01-24 Thread xubo245 (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated SPARK-26710: Attachment: wx20190124-192...@2x.png wx20190124-192...@2x.png > ImageSchemaSuite has some

[jira] [Updated] (SPARK-26710) ImageSchemaSuite has some errors when running it in local laptop

2019-01-24 Thread xubo245 (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] xubo245 updated SPARK-26710: Description: ImageSchemaSuite and org.apache.spark.ml.source.image.ImageFileFormatSuite has some errors
