[jira] [Created] (SPARK-13939) Kafka createDirectStream not parallelizing properly

2016-03-19 Thread Ben Teeuwen (JIRA)
Ben Teeuwen created SPARK-13939: --- Summary: Kafka createDirectStream not parallelizing properly Key: SPARK-13939 URL: https://issues.apache.org/jira/browse/SPARK-13939 Project: Spark Issue Type:

[jira] [Updated] (SPARK-13978) [GSoC 2016] Build monitoring UI (and infrastructure) for Spark SQL and structured streaming

2016-03-19 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-13978: - Summary: [GSoC 2016] Build monitoring UI (and infrastructure) for Spark SQL and structured streaming (wa

[jira] [Commented] (SPARK-13933) hadoop-2.7 profile's curator version should be 2.7.1

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197217#comment-15197217 ] Sean Owen commented on SPARK-13933: --- Tachyon isn't used in the project anymore in 2.x a

[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-19 Thread Kostas Sakellis (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200252#comment-15200252 ] Kostas Sakellis commented on SPARK-13877: - How is this any different than creatin

[jira] [Commented] (SPARK-13861) TPCDS query 40 returns wrong results compared to TPC official result set

2016-03-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198526#comment-15198526 ] Xiao Li commented on SPARK-13861: - Great job! I am just wondering if only cs_sales_price

[jira] [Created] (SPARK-13943) The behavior of sum(booleantype) in Spark DataFrames is not intuitive

2016-03-19 Thread Wes McKinney (JIRA)
Wes McKinney created SPARK-13943: Summary: The behavior of sum(booleantype) in Spark DataFrames is not intuitive Key: SPARK-13943 URL: https://issues.apache.org/jira/browse/SPARK-13943 Project: Spark

[jira] [Assigned] (SPARK-14011) Enable `LineLength` Java checkstyle rule

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-14011: Assignee: Apache Spark > Enable `LineLength` Java checkstyle rule > --

[jira] [Issue Comment Deleted] (SPARK-13963) Add binary toggle Param to ml.HashingTF

2016-03-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13963: --- Comment: was deleted (was: Sure, assigned to you.) > Add binary toggle Param to ml.HashingTF

[jira] [Commented] (SPARK-13865) TPCDS query 87 returns wrong results compared to TPC official result set

2016-03-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200821#comment-15200821 ] Xiao Li commented on SPARK-13865: - The query I posted here is downloaded from the officia

[jira] [Assigned] (SPARK-13938) word2phrase feature created in ML

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13938: Assignee: (was: Apache Spark) > word2phrase feature created in ML > --

[jira] [Updated] (SPARK-14010) ColumnPruning is conflict with PushPredicateThroughProject

2016-03-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14010: --- Description: ColumnPruning will insert a Project before Filter, but > ColumnPruning is conflict with

[jira] [Updated] (SPARK-13979) Killed executor is respawned without AWS keys in standalone spark cluster

2016-03-19 Thread Allen George (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Allen George updated SPARK-13979: - Description: I'm having a problem where respawning a failed executor during a job that reads/wri

[jira] [Resolved] (SPARK-13816) Add parameter checks for algorithms in Graphx

2016-03-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-13816. - Resolution: Fixed Assignee: zhengruifeng Fix Version/s: 2.0.0 > Add parameter che

[jira] [Resolved] (SPARK-13901) We get wrong logdebug information when jump to the next locality level.

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-13901. --- Resolution: Fixed Fix Version/s: 1.6.2 2.0.0 Issue resolved by pull request

[jira] [Commented] (SPARK-14014) Replace existing analysis.Catalog with SessionCatalog

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202318#comment-15202318 ] Apache Spark commented on SPARK-14014: -- User 'andrewor14' has created a pull request

[jira] [Updated] (SPARK-13905) Change signature of as.data.frame() to be consistent with the R base package

2016-03-19 Thread Sun Rui (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sun Rui updated SPARK-13905: Description: (was: SparkR provides a method as.data.frame() to collect a SparkR DataFrame into a local

[jira] [Updated] (SPARK-13964) Feature hashing improvements

2016-03-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13964: --- Priority: Minor (was: Major) > Feature hashing improvements > >

[jira] [Updated] (SPARK-13963) Add binary toggle Param to ml.HashingTF

2016-03-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-13963: --- Assignee: Bryan Cutler > Add binary toggle Param to ml.HashingTF > --

[jira] [Updated] (SPARK-12789) Support order by position in SQL

2016-03-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12789: Description: This is to support order by position in SQL, e.g. {noformat} select c1, c2, c3 from t

[jira] [Updated] (SPARK-14010) ColumnPruning is conflict with PushPredicateThroughProject

2016-03-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-14010: --- Description: ColumnPruning will insert a Project before Filter, but PushPredicateThroughProject will

[jira] [Created] (SPARK-13976) do not remove sub-queries added by user when generate SQL

2016-03-19 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-13976: --- Summary: do not remove sub-queries added by user when generate SQL Key: SPARK-13976 URL: https://issues.apache.org/jira/browse/SPARK-13976 Project: Spark Issue

[jira] [Assigned] (SPARK-13951) PySpark ml.pipeline support export/import - nested Piplines

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13951: Assignee: Apache Spark > PySpark ml.pipeline support export/import - nested Piplines > ---

[jira] [Commented] (SPARK-13957) Support group by ordinal in SQL

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203089#comment-15203089 ] Apache Spark commented on SPARK-13957: -- User 'gatorsmile' has created a pull request

[jira] [Assigned] (SPARK-13957) Support group by ordinal in SQL

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13957: Assignee: (was: Apache Spark) > Support group by ordinal in SQL >

[jira] [Assigned] (SPARK-13957) Support group by ordinal in SQL

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13957: Assignee: Apache Spark > Support group by ordinal in SQL > ---

[jira] [Created] (SPARK-13946) PySpark DataFrames allows you to silently use aggregate expressions derived from different table expressions

2016-03-19 Thread Wes McKinney (JIRA)
Wes McKinney created SPARK-13946: Summary: PySpark DataFrames allows you to silently use aggregate expressions derived from different table expressions Key: SPARK-13946 URL: https://issues.apache.org/jira/browse/S

[jira] [Updated] (SPARK-13932) CUBE Query with filter (HAVING) and condition (IF) raises an AnalysisException

2016-03-19 Thread Tien-Dung LE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tien-Dung LE updated SPARK-13932: - Affects Version/s: 2.0.0 > CUBE Query with filter (HAVING) and condition (IF) raises an AnalysisE

[jira] [Commented] (SPARK-13950) Generate code for sort merge left/right outer join

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198162#comment-15198162 ] Apache Spark commented on SPARK-13950: -- User 'davies' has created a pull request for

[jira] [Updated] (SPARK-13982) SparkR - KMeans predict: Output column name of features is an unclear, automatic genetared text

2016-03-19 Thread Narine Kokhlikyan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Narine Kokhlikyan updated SPARK-13982: -- Summary: SparkR - KMeans predict: Output column name of features is an unclear, automat

[jira] [Assigned] (SPARK-13858) TPCDS query 21 returns wrong results compared to TPC official result set

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13858: Assignee: Apache Spark > TPCDS query 21 returns wrong results compared to TPC official res

[jira] [Updated] (SPARK-7992) Hide private classes/objects in in generated Java API doc

2016-03-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-7992: - Assignee: (was: Xiangrui Meng) > Hide private classes/objects in in generated Java API doc > -

[jira] [Updated] (SPARK-13038) PySpark ml.pipeline support export/import - non-nested Pipelines

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13038: -- Summary: PySpark ml.pipeline support export/import - non-nested Pipelines (was: PySpar

[jira] [Created] (SPARK-14009) Fail the tests if the any catalyst rule reach max number of iteration.

2016-03-19 Thread Davies Liu (JIRA)
Davies Liu created SPARK-14009: -- Summary: Fail the tests if the any catalyst rule reach max number of iteration. Key: SPARK-14009 URL: https://issues.apache.org/jira/browse/SPARK-14009 Project: Spark

[jira] [Resolved] (SPARK-13776) Web UI is not available after ./sbin/start-master.sh

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-13776. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11615 [https://github.co

[jira] [Commented] (SPARK-13461) Duplicated example code merge and cleanup

2016-03-19 Thread Xusen Yin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203076#comment-15203076 ] Xusen Yin commented on SPARK-13461: --- Yes we'll delete it. > Duplicated example code me

[jira] [Commented] (SPARK-13937) PySpark ML JavaWrapper, variable _java_obj should not be static

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197902#comment-15197902 ] Apache Spark commented on SPARK-13937: -- User 'BryanCutler' has created a pull reques

[jira] [Closed] (SPARK-13821) TPC-DS Query 20 fails to compile

2016-03-19 Thread Roy Cecil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roy Cecil closed SPARK-13821. - Resolution: Not A Problem > TPC-DS Query 20 fails to compile > > >

[jira] [Commented] (SPARK-13761) Deprecate validateParams

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200097#comment-15200097 ] Apache Spark commented on SPARK-13761: -- User 'jkbradley' has created a pull request

[jira] [Comment Edited] (SPARK-13935) Other clients' connection hang up when someone do huge load

2016-03-19 Thread Tao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197554#comment-15197554 ] Tao Wang edited comment on SPARK-13935 at 3/16/16 3:51 PM: --- [~m

[jira] [Updated] (SPARK-13983) HiveThriftServer2 can not get "--hiveconf" or ''--hivevar" variables since 1.6 version (both multi-session and single session)

2016-03-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-13983: --- Assignee: Cheng Lian > HiveThriftServer2 can not get "--hiveconf" or ''--hivevar" variables since >

[jira] [Updated] (SPARK-12789) Support order by position in SQL

2016-03-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12789: Summary: Support order by position in SQL (was: Support order by position) > Support order by posi

[jira] [Commented] (SPARK-13960) JAR/File HTTP Server doesn't respect "spark.driver.host" and there is no "spark.fileserver.host" option

2016-03-19 Thread Ilya Ostrovskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200690#comment-15200690 ] Ilya Ostrovskiy commented on SPARK-13960: - exporting the SPARK_LOCAL_IP environme

[jira] [Updated] (SPARK-12789) Support order by position in SQL

2016-03-19 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12789?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-12789: Description: This is to support order by position in SQL, e.g. {noformat} select c1, c2, c3 from t

[jira] [Created] (SPARK-13961) spark.ml ChiSqSelector should support other numeric types for label

2016-03-19 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-13961: -- Summary: spark.ml ChiSqSelector should support other numeric types for label Key: SPARK-13961 URL: https://issues.apache.org/jira/browse/SPARK-13961 Project: Spar

[jira] [Commented] (SPARK-13928) Move org.apache.spark.Logging into org.apache.spark.internal.Logging

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197584#comment-15197584 ] Apache Spark commented on SPARK-13928: -- User 'cloud-fan' has created a pull request

[jira] [Comment Edited] (SPARK-13821) TPC-DS Query 20 fails to compile

2016-03-19 Thread Roy Cecil (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201506#comment-15201506 ] Roy Cecil edited comment on SPARK-13821 at 3/18/16 2:09 PM: D

[jira] [Assigned] (SPARK-13993) PySpark ml.feature.RFormula/RFormulaModel support export/import

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13993: Assignee: Apache Spark > PySpark ml.feature.RFormula/RFormulaModel support export/import >

[jira] [Commented] (SPARK-13733) Support initial weight distribution in personalized PageRank

2016-03-19 Thread Gayathri Murali (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198327#comment-15198327 ] Gayathri Murali commented on SPARK-13733: - [~mengxr] Should the rest of the verti

[jira] [Resolved] (SPARK-13034) PySpark ml.classification support export/import

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-13034. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11707 [h

[jira] [Commented] (SPARK-14005) Make RDD more compatible with Scala's collection

2016-03-19 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203056#comment-15203056 ] zhengruifeng commented on SPARK-14005: -- ok, plz close this jira. > Make RDD more co

[jira] [Commented] (SPARK-13968) Use MurmurHash3 for hashing String features

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202003#comment-15202003 ] Joseph K. Bradley commented on SPARK-13968: --- I'm going to close this in favor o

[jira] [Commented] (SPARK-13629) Add binary toggle Param to CountVectorizer

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201993#comment-15201993 ] Joseph K. Bradley commented on SPARK-13629: --- [~mlnick] Thanks for handling thes

[jira] [Assigned] (SPARK-11319) PySpark silently accepts null values in non-nullable DataFrame fields.

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11319: Assignee: (was: Apache Spark) > PySpark silently accepts null values in non-nullable D

[jira] [Updated] (SPARK-13948) MiMa Check should catch if the visibility change to `private`

2016-03-19 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-13948: --- Component/s: Project Infra > MiMa Check should catch if the visibility change to `private` > --

[jira] [Commented] (SPARK-13865) TPCDS query 87 returns wrong results compared to TPC official result set

2016-03-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200516#comment-15200516 ] Xiao Li commented on SPARK-13865: - This is the same as the https://issues.apache.org/jira

[jira] [Updated] (SPARK-13972) hive tests should fail if SQL generation failed

2016-03-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-13972: --- Assignee: Wenchen Fan > hive tests should fail if SQL generation failed > ---

[jira] [Updated] (SPARK-13776) Web UI is not available after ./sbin/start-master.sh

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13776: -- Assignee: Shixiong Zhu > Web UI is not available after ./sbin/start-master.sh > ---

[jira] [Resolved] (SPARK-10788) Decision Tree duplicates bins for unordered categorical features

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-10788. --- Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 9474 [ht

[jira] [Commented] (SPARK-12719) SQL generation support for generators (including UDTF)

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200419#comment-15200419 ] Apache Spark commented on SPARK-12719: -- User 'yy2016' has created a pull request for

[jira] [Commented] (SPARK-13461) Duplicated example code merge and cleanup

2016-03-19 Thread Gabor Liptak (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203041#comment-15203041 ] Gabor Liptak commented on SPARK-13461: -- [~yinxusen] {{examples/src/main/scala/org/a

[jira] [Commented] (SPARK-13969) Extend input format that feature hashing can handle

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202007#comment-15202007 ] Joseph K. Bradley commented on SPARK-13969: --- I think HashingTF could be extende

[jira] [Updated] (SPARK-13960) HTTP-based JAR Server doesn't respect spark.driver.host and there is no "spark.fileserver.host" option

2016-03-19 Thread Ilya Ostrovskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Ostrovskiy updated SPARK-13960: Description: There is no option to specify which hostname/IP address the jar/file server l

[jira] [Commented] (SPARK-13968) Use MurmurHash3 for hashing String features

2016-03-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200254#comment-15200254 ] Nick Pentreath commented on SPARK-13968: Sure, I will assign to you. But I'd like

[jira] [Created] (SPARK-13938) word2phrase feature created in ML

2016-03-19 Thread Steve Weng (JIRA)
Steve Weng created SPARK-13938: -- Summary: word2phrase feature created in ML Key: SPARK-13938 URL: https://issues.apache.org/jira/browse/SPARK-13938 Project: Spark Issue Type: New Feature

[jira] [Assigned] (SPARK-13958) Executor OOM due to unbounded growth of pointer array in Sorter

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13958: Assignee: (was: Apache Spark) > Executor OOM due to unbounded growth of pointer array

[jira] [Updated] (SPARK-10574) HashingTF should use MurmurHash3

2016-03-19 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-10574: -- Assignee: Yanbo Liang > HashingTF should use MurmurHash3 > > >

[jira] [Created] (SPARK-13951) PySpark ml.pipeline support export/import - nested Piplines

2016-03-19 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-13951: - Summary: PySpark ml.pipeline support export/import - nested Piplines Key: SPARK-13951 URL: https://issues.apache.org/jira/browse/SPARK-13951 Project: Spark

[jira] [Created] (SPARK-13988) Large history files block new applications from showing up in History UI.

2016-03-19 Thread Parth Brahmbhatt (JIRA)
Parth Brahmbhatt created SPARK-13988: Summary: Large history files block new applications from showing up in History UI. Key: SPARK-13988 URL: https://issues.apache.org/jira/browse/SPARK-13988 Pro

[jira] [Updated] (SPARK-10574) HashingTF should use MurmurHash3

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-10574: -- Issue Type: Sub-task (was: Improvement) Parent: SPARK-13964 > HashingTF should

[jira] [Commented] (SPARK-13886) ArrayType of BinaryType not supported in Row.equals method

2016-03-19 Thread MahmoudHanafy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198763#comment-15198763 ] MahmoudHanafy commented on SPARK-13886: --- I think List extends Seq !! In this case,

[jira] [Commented] (SPARK-12177) Update KafkaDStreams to new Kafka 0.9 Consumer API

2016-03-19 Thread Cody Koeninger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203027#comment-15203027 ] Cody Koeninger commented on SPARK-12177: Unless I'm misunderstanding your point,

[jira] [Assigned] (SPARK-13997) Use Hadoop 2.0 default value for compression in data sources

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13997: Assignee: (was: Apache Spark) > Use Hadoop 2.0 default value for compression in data s

[jira] [Commented] (SPARK-13865) TPCDS query 87 returns wrong results compared to TPC official result set

2016-03-19 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200637#comment-15200637 ] JESSE CHEN commented on SPARK-13865: This maybe a TPC toolkit issue. Will be looking

[jira] [Created] (SPARK-13995) Constraints should take care of Cast

2016-03-19 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-13995: --- Summary: Constraints should take care of Cast Key: SPARK-13995 URL: https://issues.apache.org/jira/browse/SPARK-13995 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-13967) Add binary toggle Param to PySpark CountVectorizer

2016-03-19 Thread Nick Pentreath (JIRA)
Nick Pentreath created SPARK-13967: -- Summary: Add binary toggle Param to PySpark CountVectorizer Key: SPARK-13967 URL: https://issues.apache.org/jira/browse/SPARK-13967 Project: Spark Issue

[jira] [Commented] (SPARK-12719) SQL generation support for generators (including UDTF)

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199852#comment-15199852 ] Apache Spark commented on SPARK-12719: -- User 'yy2016' has created a pull request for

[jira] [Updated] (SPARK-13960) HTTP-based JAR Server doesn't respect spark.driver.host and there is no "spark.fileserver.host" option

2016-03-19 Thread Ilya Ostrovskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Ostrovskiy updated SPARK-13960: Description: There is no option to specify which hostname/IP address the jar/file server l

[jira] [Assigned] (SPARK-13992) Add support for off-heap caching

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13992: Assignee: Josh Rosen (was: Apache Spark) > Add support for off-heap caching > ---

[jira] [Commented] (SPARK-13967) Add binary toggle Param to PySpark CountVectorizer

2016-03-19 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201339#comment-15201339 ] Nick Pentreath commented on SPARK-13967: [~yuhaoyan] or [~bryanc] would you like

[jira] [Resolved] (SPARK-13989) Remove non-vectorized/unsafe-row parquet record reader

2016-03-19 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu resolved SPARK-13989. Resolution: Fixed Fix Version/s: 2.0.0 Issue resolved by pull request 11799 [https://github.

[jira] [Created] (SPARK-14016) Support high-precision decimals in vectorized parquet reader

2016-03-19 Thread Sameer Agarwal (JIRA)
Sameer Agarwal created SPARK-14016: -- Summary: Support high-precision decimals in vectorized parquet reader Key: SPARK-14016 URL: https://issues.apache.org/jira/browse/SPARK-14016 Project: Spark

[jira] [Commented] (SPARK-13986) Make `DeveloperApi`-annotated things public

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200489#comment-15200489 ] Apache Spark commented on SPARK-13986: -- User 'dongjoon-hyun' has created a pull requ

[jira] [Commented] (SPARK-12177) Update KafkaDStreams to new Kafka 0.9 Consumer API

2016-03-19 Thread Eugene Miretsky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203018#comment-15203018 ] Eugene Miretsky commented on SPARK-12177: - The new Kafka Java Consumer is using D

[jira] [Assigned] (SPARK-13976) do not remove sub-queries added by user when generate SQL

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13976: Assignee: Apache Spark > do not remove sub-queries added by user when generate SQL > -

[jira] [Created] (SPARK-13940) Predicate Transitive Closure Transformation

2016-03-19 Thread Alex Antonov (JIRA)
Alex Antonov created SPARK-13940: Summary: Predicate Transitive Closure Transformation Key: SPARK-13940 URL: https://issues.apache.org/jira/browse/SPARK-13940 Project: Spark Issue Type: Impro

[jira] [Updated] (SPARK-13937) PySpark ML JavaWrapper, variable _java_obj should not be static

2016-03-19 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-13937: -- Priority: Trivial (was: Minor) > PySpark ML JavaWrapper, variable _java_obj should not

[jira] [Updated] (SPARK-13938) word2phrase feature created in ML

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-13938: -- [~s4weng] "Critical" is inappropriate here. Please read https://cwiki.apache.org/confluence/display/SPARK/

[jira] [Assigned] (SPARK-913) log the size of each shuffle block in block manager

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-913: -- Assignee: Apache Spark > log the size of each shuffle block in block manager > -

[jira] [Created] (SPARK-13973) `ipython notebook` is going away...

2016-03-19 Thread Bogdan Pirvu (JIRA)
Bogdan Pirvu created SPARK-13973: Summary: `ipython notebook` is going away... Key: SPARK-13973 URL: https://issues.apache.org/jira/browse/SPARK-13973 Project: Spark Issue Type: Improvement

[jira] [Created] (SPARK-13974) sub-query names do not need to be globally unique while generate SQL

2016-03-19 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-13974: --- Summary: sub-query names do not need to be globally unique while generate SQL Key: SPARK-13974 URL: https://issues.apache.org/jira/browse/SPARK-13974 Project: Spark

[jira] [Resolved] (SPARK-13360) pyspark related enviroment variable is not propagated to driver in yarn-cluster mode

2016-03-19 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-13360. Resolution: Fixed Assignee: Jeff Zhang Fix Version/s: 2.0.0 > pyspark relat

[jira] [Updated] (SPARK-14001) support multi-children Union in SQLBuilder

2016-03-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-14001: --- Assignee: Wenchen Fan > support multi-children Union in SQLBuilder >

[jira] [Commented] (SPARK-13877) Consider removing Kafka modules from Spark / Spark Streaming

2016-03-19 Thread Hari Shreedharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200306#comment-15200306 ] Hari Shreedharan commented on SPARK-13877: -- You could have separate repos and se

[jira] [Created] (SPARK-13993) PySpark ml.feature.RFormula/RFormulaModel support export/import

2016-03-19 Thread Xusen Yin (JIRA)
Xusen Yin created SPARK-13993: - Summary: PySpark ml.feature.RFormula/RFormulaModel support export/import Key: SPARK-13993 URL: https://issues.apache.org/jira/browse/SPARK-13993 Project: Spark Is

[jira] [Assigned] (SPARK-13937) PySpark ML JavaWrapper, variable _java_obj should not be static

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13937: Assignee: (was: Apache Spark) > PySpark ML JavaWrapper, variable _java_obj should not

[jira] [Commented] (SPARK-13955) Spark in yarn mode fails

2016-03-19 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199198#comment-15199198 ] Sean Owen commented on SPARK-13955: --- Is this likely? the YARN tests succeed. There isn'

[jira] [Assigned] (SPARK-13719) Bad JSON record raises java.lang.ClassCastException

2016-03-19 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-13719: Assignee: (was: Apache Spark) > Bad JSON record raises java.lang.ClassCastException >

[jira] [Commented] (SPARK-13864) TPCDS query 74 returns wrong results compared to TPC official result set

2016-03-19 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198602#comment-15198602 ] Xiao Li commented on SPARK-13864: - This is the same issue as SPARK-13862. I think we can

[jira] [Updated] (SPARK-12719) SQL generation support for generators (including UDTF)

2016-03-19 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Lian updated SPARK-12719: --- Assignee: Wenchen Fan > SQL generation support for generators (including UDTF) >

[jira] [Commented] (SPARK-13865) TPCDS query 87 returns wrong results compared to TPC official result set

2016-03-19 Thread JESSE CHEN (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200886#comment-15200886 ] JESSE CHEN commented on SPARK-13865: You rock! > TPCDS query 87 returns wrong result
