[jira] [Commented] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtil will overwrite user config

2017-08-23 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139590#comment-16139590 ] Saisai Shao commented on SPARK-21819: - Then I think there should be no issue in Spark, right?

[jira] [Commented] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtil will overwrite user config

2017-08-23 Thread Keith Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139586#comment-16139586 ] Keith Sun commented on SPARK-21819: --- [~vanzin], the third option works for my use case, i could add

[jira] [Resolved] (SPARK-21805) disable R vignettes code on Windows

2017-08-23 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-21805. -- Resolution: Fixed Assignee: Felix Cheung Fix Version/s: 2.3.0

[jira] [Updated] (SPARK-21745) Refactor ColumnVector hierarchy to make ColumnVector read-only and to introduce WritableColumnVector.

2017-08-23 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin updated SPARK-21745: -- Description: This is a refactoring of {{ColumnVector}} hierarchy and related classes. # make

[jira] [Resolved] (SPARK-21807) The getAliasedConstraints function in LogicalPlan will take a long time when number of expressions is greater than 100

2017-08-23 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-21807. - Resolution: Fixed Fix Version/s: 2.3.0 > The getAliasedConstraints function in LogicalPlan will

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-08-23 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139507#comment-16139507 ] Takuya Ueshin commented on SPARK-21190: --- [~icexelloss] We can know the length of input from

[jira] [Commented] (SPARK-17321) YARN shuffle service should use good disk from yarn.nodemanager.local-dirs

2017-08-23 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139502#comment-16139502 ] Saisai Shao commented on SPARK-17321: - 1. if NM recovery is enabled, then yarn will provide a

[jira] [Commented] (SPARK-17321) YARN shuffle service should use good disk from yarn.nodemanager.local-dirs

2017-08-23 Thread lishuming (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139488#comment-16139488 ] lishuming commented on SPARK-17321: --- [~jerryshao] I agree with what you said, however there are some

[jira] [Commented] (SPARK-21733) ERROR executor.CoarseGrainedExecutorBackend: RECEIVED SIGNAL TERM

2017-08-23 Thread lishuming (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139482#comment-16139482 ] lishuming commented on SPARK-21733: --- [~1028344...@qq.com] Maybe you should check the executor's log to

[jira] [Commented] (SPARK-15689) Data source API v2

2017-08-23 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139479#comment-16139479 ] Wenchen Fan commented on SPARK-15689: - yea, `LogicalPlan` is an internal concept and we can't use it

[jira] [Commented] (SPARK-21660) Yarn ShuffleService failed to start when the chosen directory become read-only

2017-08-23 Thread lishuming (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139471#comment-16139471 ] lishuming commented on SPARK-21660: --- Sorry, this is a dup of

[jira] [Reopened] (SPARK-21816) The comment of Class ExchangeCoordinator exist a typing and context error

2017-08-23 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-21816: -- > The comment of Class ExchangeCoordinator exist a typing and context error >

[jira] [Resolved] (SPARK-21816) The comment of Class ExchangeCoordinator exist a typing and context error

2017-08-23 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-21816. -- Resolution: Invalid > The comment of Class ExchangeCoordinator exist a typing and context

[jira] [Closed] (SPARK-21816) The comment of Class ExchangeCoordinator exist a typing and context error

2017-08-23 Thread lufei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lufei closed SPARK-21816. - Resolution: Fixed > The comment of Class ExchangeCoordinator exist a typing and context error >

[jira] [Commented] (SPARK-21816) The comment of Class ExchangeCoordinator exist a typing and context error

2017-08-23 Thread lufei (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139455#comment-16139455 ] lufei commented on SPARK-21816: --- [~hyukjin.kwon], I'm sorry for this, I will close this issue immediately.

[jira] [Commented] (SPARK-21770) ProbabilisticClassificationModel: Improve normalization of all-zero raw predictions

2017-08-23 Thread Weichen Xu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139364#comment-16139364 ] Weichen Xu commented on SPARK-21770: Hmm... `normalizeToProbabilitiesInPlace` is only effective in

[jira] [Commented] (SPARK-19954) Joining to a unioned DataFrame does not produce expected result.

2017-08-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139229#comment-16139229 ] Sean Owen commented on SPARK-19954: --- Might be a mistake about exactly what other change resolved this.

[jira] [Commented] (SPARK-19954) Joining to a unioned DataFrame does not produce expected result.

2017-08-23 Thread Adam Heinermann (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139058#comment-16139058 ] Adam Heinermann commented on SPARK-19954: - How is an issue that affects version 2.1.0 resolved as

[jira] [Commented] (SPARK-21535) Reduce memory requirement for CrossValidator and TrainValidationSplit

2017-08-23 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16139023#comment-16139023 ] yuhao yang commented on SPARK-21535: Thanks for the comments. > Reduce memory requirement for

[jira] [Commented] (SPARK-21752) Config spark.jars.packages is ignored in SparkSession config

2017-08-23 Thread Jakub Nowacki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138948#comment-16138948 ] Jakub Nowacki commented on SPARK-21752: --- OK I did one more extra test and, indeed, on the newest

[jira] [Resolved] (SPARK-21817) Pass FSPermissions to LocatedFileStatus from InMemoryFileIndex

2017-08-23 Thread Ewan Higgs (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewan Higgs resolved SPARK-21817. Resolution: Invalid This was caused by a change in a stable/evolving interface which previously

[jira] [Commented] (SPARK-21817) Pass FSPermissions to LocatedFileStatus from InMemoryFileIndex

2017-08-23 Thread Ewan Higgs (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138905#comment-16138905 ] Ewan Higgs commented on SPARK-21817: {quote} Ewan: do a patch there with a new test method (where?) &

[jira] [Closed] (SPARK-17771) Allow start-master/slave scripts to start in the foreground

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun closed SPARK-17771. - Resolution: Duplicate > Allow start-master/slave scripts to start in the foreground >

[jira] [Reopened] (SPARK-17771) Allow start-master/slave scripts to start in the foreground

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reopened SPARK-17771: --- > Allow start-master/slave scripts to start in the foreground >

[jira] [Commented] (SPARK-12449) Pushing down arbitrary logical plans to data sources

2017-08-23 Thread Evan Chan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138856#comment-16138856 ] Evan Chan commented on SPARK-12449: --- Andrew and others: Is there a plan to make this CatalystSource

[jira] [Closed] (SPARK-17891) SQL-based three column join loses first column

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun closed SPARK-17891. - Resolution: Duplicate > SQL-based three column join loses first column >

[jira] [Commented] (SPARK-18656) org.apache.spark.sql.execution.stat.StatFunctions#multipleApproxQuantiles requires too much memory in case of many columns

2017-08-23 Thread poplav (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138853#comment-16138853 ] poplav commented on SPARK-18656: [~barrybecker4], Any more insights into this. I am going to have to do

[jira] [Reopened] (SPARK-17891) SQL-based three column join loses first column

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun reopened SPARK-17891: --- > SQL-based three column join loses first column >

[jira] [Updated] (SPARK-18594) Name Validation of Databases/Tables

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-18594: -- Fix Version/s: 2.1.0 > Name Validation of Databases/Tables >

[no subject]

2017-08-23 Thread Wei Zheng

[jira] [Updated] (SPARK-19307) SPARK-17387 caused ignorance of conf object passed to SparkContext:

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-19307: -- Fix Version/s: 2.1.1 2.2.0 > SPARK-17387 caused ignorance of conf object

[jira] [Commented] (SPARK-19307) SPARK-17387 caused ignorance of conf object passed to SparkContext:

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138840#comment-16138840 ] Dongjoon Hyun commented on SPARK-19307: --- Hi, [~irinatruong]. Yes. It's available in 2.1.1. Maybe,

[jira] [Updated] (SPARK-18415) Weird Plan Output when CTE used in RunnableCommand

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-18415: -- Fix Version/s: 2.1.0 > Weird Plan Output when CTE used in RunnableCommand >

[jira] [Updated] (SPARK-21102) Refresh command is too aggressive in parsing

2017-08-23 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21102: Fix Version/s: 2.3.0 > Refresh command is too aggressive in parsing >

[jira] [Commented] (SPARK-20754) Add Function Alias For MOD/TRUNCT/POSITION

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138829#comment-16138829 ] Dongjoon Hyun commented on SPARK-20754: --- Hi, [~smilegator]. Could you set [~q79969786] as

[jira] [Assigned] (SPARK-21102) Refresh command is too aggressive in parsing

2017-08-23 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-21102: --- Assignee: Anton Okolnychyi > Refresh command is too aggressive in parsing >

[jira] [Comment Edited] (SPARK-6761) Approximate quantile

2017-08-23 Thread poplav (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138809#comment-16138809 ] poplav edited comment on SPARK-6761 at 8/23/17 6:22 PM: Question: Say I have a

[jira] [Updated] (SPARK-20754) Add Function Alias For MOD/TRUNCT/POSITION

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-20754: -- Fix Version/s: 2.3.0 > Add Function Alias For MOD/TRUNCT/POSITION >

[jira] [Updated] (SPARK-20953) Add hash map metrics to aggregate and join

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-20953: -- Fix Version/s: 2.3.0 > Add hash map metrics to aggregate and join >

[jira] [Commented] (SPARK-19571) tests are failing to run on Windows with another instance Derby error with Hadoop 2.6.5

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138821#comment-16138821 ] Dongjoon Hyun commented on SPARK-19571: --- Hi, [~hyukjin.kwon]. Could you set `Fix Version`? Thanks!

[jira] [Updated] (SPARK-21256) Add WithSQLConf to Catalyst Test

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-21256: -- Fix Version/s: 2.3.0 > Add WithSQLConf to Catalyst Test > > >

[jira] [Commented] (SPARK-6761) Approximate quantile

2017-08-23 Thread poplav (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138809#comment-16138809 ] poplav commented on SPARK-6761: --- Question: Say I have a DataFrame of 1000 columns. I want approximate

[jira] [Updated] (SPARK-18539) Cannot filter by nonexisting column in parquet file

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-18539: -- Fix Version/s: 2.2.0 > Cannot filter by nonexisting column in parquet file >

[jira] [Updated] (SPARK-21578) Add JavaSparkContextSuite

2017-08-23 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-21578: -- Fix Version/s: 2.3.0 > Add JavaSparkContextSuite > - > >

[jira] [Commented] (SPARK-21817) Pass FSPermissions to LocatedFileStatus from InMemoryFileIndex

2017-08-23 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138685#comment-16138685 ] Steve Loughran commented on SPARK-21817: API is tagged as stable/evolving; it's clearly in use

[jira] [Commented] (SPARK-21817) Pass FSPermissions to LocatedFileStatus from InMemoryFileIndex

2017-08-23 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138683#comment-16138683 ] Steve Loughran commented on SPARK-21817: I think it's a regression in HDFS-6984; the superclass

[jira] [Commented] (SPARK-21770) ProbabilisticClassificationModel: Improve normalization of all-zero raw predictions

2017-08-23 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138680#comment-16138680 ] Yanbo Liang commented on SPARK-21770: - [~srowen] Of course, we should understand what outputs [0, 0,

[jira] [Commented] (SPARK-21817) Pass FSPermissions to LocatedFileStatus from InMemoryFileIndex

2017-08-23 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138672#comment-16138672 ] Marcelo Vanzin commented on SPARK-21817: Not sure if it counts as a regression since the behavior

[jira] [Commented] (SPARK-21817) Pass FSPermissions to LocatedFileStatus from InMemoryFileIndex

2017-08-23 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138664#comment-16138664 ] Steve Loughran commented on SPARK-21817: This a regression in HDFS? > Pass FSPermissions to

[jira] [Resolved] (SPARK-21501) Spark shuffle index cache size should be memory based

2017-08-23 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-21501. --- Resolution: Fixed Assignee: Sanket Reddy Fix Version/s: 2.3.0 > Spark

[jira] [Updated] (SPARK-21807) The getAliasedConstraints function in LogicalPlan will take a long time when number of expressions is greater than 100

2017-08-23 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-21807: Target Version/s: 2.3.0 > The getAliasedConstraints function in LogicalPlan will take a long time when >

[jira] [Assigned] (SPARK-21807) The getAliasedConstraints function in LogicalPlan will take a long time when number of expressions is greater than 100

2017-08-23 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-21807: --- Assignee: eaton > The getAliasedConstraints function in LogicalPlan will take a long time when >

[jira] [Commented] (SPARK-15689) Data source API v2

2017-08-23 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138607#comment-16138607 ] Reynold Xin commented on SPARK-15689: - Not the author but my guess is that the other approach

[jira] [Commented] (SPARK-15799) Release SparkR on CRAN

2017-08-23 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138602#comment-16138602 ] Shivaram Venkataraman commented on SPARK-15799: --- The email I got from CRAN is pasted below.

[jira] [Commented] (SPARK-21817) Pass FSPermissions to LocatedFileStatus from InMemoryFileIndex

2017-08-23 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138552#comment-16138552 ] Marcelo Vanzin commented on SPARK-21817: That HDFS change is only in Hadoop 3, which is not

[jira] [Commented] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtil will overwrite user config

2017-08-23 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138550#comment-16138550 ] Marcelo Vanzin commented on SPARK-21819: There are a few ways to control which Hadoop

[jira] [Commented] (SPARK-15689) Data source API v2

2017-08-23 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138509#comment-16138509 ] Andrew Ash commented on SPARK-15689: Can the authors of this document add a section contrasting the

[jira] [Comment Edited] (SPARK-21814) build spark current master can not use hive metadatamysql

2017-08-23 Thread xinzhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138075#comment-16138075 ] xinzhang edited comment on SPARK-21814 at 8/23/17 3:32 PM: --- Thanks your reply.

[jira] [Commented] (SPARK-12449) Pushing down arbitrary logical plans to data sources

2017-08-23 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138501#comment-16138501 ] Andrew Ash commented on SPARK-12449: Relevant slides:

[jira] [Updated] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtil will overwrite user config

2017-08-23 Thread Keith Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Sun updated SPARK-21819: -- Attachment: yarnsparkutil.jpg > UserGroupInformation initialization in SparkHadoopUtil will overwrite

[jira] [Commented] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtil will overwrite user config

2017-08-23 Thread Keith Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138461#comment-16138461 ] Keith Sun commented on SPARK-21819: --- [~jerryshao], i just attach the UGI update in SparkHadoopUtil >

[jira] [Commented] (SPARK-11248) Spark hivethriftserver is using the wrong user to while getting HDFS permissions

2017-08-23 Thread wuchang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138443#comment-16138443 ] wuchang commented on SPARK-11248: - +1 I have met exactly the same problem. My Spark version is 2.0.0.

[jira] [Comment Edited] (SPARK-21820) csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly

2017-08-23 Thread Kumaresh C R (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138405#comment-16138405 ] Kumaresh C R edited comment on SPARK-21820 at 8/23/17 2:24 PM: ---

[jira] [Commented] (SPARK-21799) KMeans performance regression (5-6x slowdown) in Spark 2.2

2017-08-23 Thread zakaria hili (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138408#comment-16138408 ] zakaria hili commented on SPARK-21799: -- [~Siddharth Murching], sorry about that, I think that the

[jira] [Commented] (SPARK-21820) csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly

2017-08-23 Thread Kumaresh C R (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138405#comment-16138405 ] Kumaresh C R commented on SPARK-21820: -- [~hyukjin.kwon]: Sound great.. We will wait for your

[jira] [Comment Edited] (SPARK-21820) csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly

2017-08-23 Thread Kumaresh C R (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138405#comment-16138405 ] Kumaresh C R edited comment on SPARK-21820 at 8/23/17 2:13 PM: ---

[jira] [Commented] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtil will overwrite user config

2017-08-23 Thread Keith Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138404#comment-16138404 ] Keith Sun commented on SPARK-21819: --- As the UGI is static and shared by all the component , is it

[jira] [Updated] (SPARK-21820) csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly

2017-08-23 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-21820: - Component/s: (was: Spark Core) SQL > csv option "multiLine" as "true" not

[jira] [Commented] (SPARK-21820) csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly

2017-08-23 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138379#comment-16138379 ] Hyukjin Kwon commented on SPARK-21820: -- I investigated this newline stuff a few times before. For

[jira] [Commented] (SPARK-21820) csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly

2017-08-23 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138365#comment-16138365 ] Hyukjin Kwon commented on SPARK-21820: -- I think the preferable format should be {{format("csv")}}

[jira] [Commented] (SPARK-21820) csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly

2017-08-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138362#comment-16138362 ] Sean Owen commented on SPARK-21820: --- The code you're using isn't in Spark though. It's been migrated to

[jira] [Commented] (SPARK-21172) EOFException reached end of stream in UnsafeRowSerializer

2017-08-23 Thread Lasantha Fernando (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138361#comment-16138361 ] Lasantha Fernando commented on SPARK-21172: --- I've encountered the same issue with Spark 2.1.1

[jira] [Commented] (SPARK-17321) YARN shuffle service should use good disk from yarn.nodemanager.local-dirs

2017-08-23 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138347#comment-16138347 ] Thomas Graves commented on SPARK-17321: --- Yes that sounds good. It wouldn't hurt to verify the

[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python

2017-08-23 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138340#comment-16138340 ] Li Jin commented on SPARK-21190: [~ueshin], thanks for the summary. +1 for this API. Although the

[jira] [Comment Edited] (SPARK-21820) csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly

2017-08-23 Thread Kumaresh C R (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138324#comment-16138324 ] Kumaresh C R edited comment on SPARK-21820 at 8/23/17 1:21 PM: ---

[jira] [Comment Edited] (SPARK-21820) csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly

2017-08-23 Thread Kumaresh C R (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138327#comment-16138327 ] Kumaresh C R edited comment on SPARK-21820 at 8/23/17 1:20 PM: --- [~sowen]

[jira] [Commented] (SPARK-21820) csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly

2017-08-23 Thread Kumaresh C R (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138327#comment-16138327 ] Kumaresh C R commented on SPARK-21820: -- @Sean Owen: This is an issue with spark databricks-CSV

[jira] [Commented] (SPARK-21820) csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly

2017-08-23 Thread Kumaresh C R (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138324#comment-16138324 ] Kumaresh C R commented on SPARK-21820: -- [~hyukjin.kwon]: Could you please help us here? This issue

[jira] [Commented] (SPARK-21820) csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly

2017-08-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138322#comment-16138322 ] Sean Owen commented on SPARK-21820: --- You need to use the built-in Spark CSV support if you're reporting

[jira] [Updated] (SPARK-21820) csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly

2017-08-23 Thread Kumaresh C R (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kumaresh C R updated SPARK-21820: - Attachment: windows_CRLF.csv > csv option "multiLine" as "true" not parsing windows line feed

[jira] [Created] (SPARK-21820) csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly

2017-08-23 Thread Kumaresh C R (JIRA)
Kumaresh C R created SPARK-21820: Summary: csv option "multiLine" as "true" not parsing windows line feed (CR LF) properly Key: SPARK-21820 URL: https://issues.apache.org/jira/browse/SPARK-21820

[jira] [Commented] (SPARK-17321) YARN shuffle service should use good disk from yarn.nodemanager.local-dirs

2017-08-23 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138320#comment-16138320 ] Saisai Shao commented on SPARK-17321: - We're facing the same issue. I think YARN shuffle service

[jira] [Commented] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtil will overwrite user config

2017-08-23 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138308#comment-16138308 ] Saisai Shao commented on SPARK-21819: - I'm not sure if Spark exposes the user API to set

[jira] [Commented] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtil will overwrite user config

2017-08-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138302#comment-16138302 ] Sean Owen commented on SPARK-21819: --- Would it suffice to add this configuration somehow after Spark

[jira] [Commented] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtil will overwrite user config

2017-08-23 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138300#comment-16138300 ] Saisai Shao commented on SPARK-21819: - I think here because `Configuration` object created in the

[jira] [Commented] (SPARK-12157) Support numpy types as return values of Python UDFs

2017-08-23 Thread Maciej Szymkiewicz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138299#comment-16138299 ] Maciej Szymkiewicz commented on SPARK-12157: [~felixcheung] IMHO it is not worth fixing. It

[jira] [Commented] (SPARK-21817) Pass FSPermissions to LocatedFileStatus from InMemoryFileIndex

2017-08-23 Thread Ewan Higgs (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138253#comment-16138253 ] Ewan Higgs commented on SPARK-21817: {quote}Can this be accomplished with a change that's still

[jira] [Updated] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtilwill overwrite user config

2017-08-23 Thread Keith Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Sun updated SPARK-21819: -- Description: When submit job in Java or Scala code to ,the initialization of SparkHadoopUtil will

[jira] [Updated] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtilwill overwrite user config

2017-08-23 Thread Keith Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Sun updated SPARK-21819: -- Description: When submit job in Java or Scala code to ,the initialization of SparkHadoopUtil will

[jira] [Commented] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtilwill overwrite user config

2017-08-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138223#comment-16138223 ] Sean Owen commented on SPARK-21819: --- Hm, can that be considered supported though? if you've initialized

[jira] [Updated] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtilwill overwrite user config

2017-08-23 Thread Keith Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Sun updated SPARK-21819: -- Description: When submit job in Java or Scala code to ,the initialization of SparkHadoopUtil will

[jira] [Updated] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtilwill overwrite user config

2017-08-23 Thread Keith Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Keith Sun updated SPARK-21819: -- Environment: Ubuntu 14.04, Spark 2.10/2.11 (I checked the GitHub source of 2.20; it exists there as well)

[jira] [Created] (SPARK-21819) UserGroupInformation initialization in SparkHadoopUtilwill overwrite user config

2017-08-23 Thread Keith Sun (JIRA)
Keith Sun created SPARK-21819: - Summary: UserGroupInformation initialization in SparkHadoopUtilwill overwrite user config Key: SPARK-21819 URL: https://issues.apache.org/jira/browse/SPARK-21819 Project:

[jira] [Commented] (SPARK-21770) ProbabilisticClassificationModel: Improve normalization of all-zero raw predictions

2017-08-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138220#comment-16138220 ] Sean Owen commented on SPARK-21770: --- No - it would be better to understand what outputs [0,0,0] to

[jira] [Created] (SPARK-21818) MultivariateOnlineSummarizer.variance generate negative result

2017-08-23 Thread Weichen Xu (JIRA)
Weichen Xu created SPARK-21818: -- Summary: MultivariateOnlineSummarizer.variance generate negative result Key: SPARK-21818 URL: https://issues.apache.org/jira/browse/SPARK-21818 Project: Spark

[jira] [Updated] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-08-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21476: -- Priority: Minor (was: Major) Issue Type: Improvement (was: Bug) > RandomForest classification

[jira] [Updated] (SPARK-21817) Pass FSPermissions to LocatedFileStatus from InMemoryFileIndex

2017-08-23 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-21817: -- Priority: Minor (was: Major) Fix Version/s: (was: 2.3.0) Issue Type: Improvement

[jira] [Comment Edited] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-08-23 Thread Saurabh Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138214#comment-16138214 ] Saurabh Agrawal edited comment on SPARK-21476 at 8/23/17 10:44 AM: ---

[jira] [Updated] (SPARK-21817) Pass FSPermissions to LocatedFileStatus from InMemoryFileIndex

2017-08-23 Thread Ewan Higgs (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewan Higgs updated SPARK-21817: --- Attachment: SPARK-21817.001.patch Attaching simple fix that will no longer NPE on Hadoop head. >

[jira] [Commented] (SPARK-21476) RandomForest classification model not using broadcast in transform

2017-08-23 Thread Saurabh Agrawal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138214#comment-16138214 ] Saurabh Agrawal commented on SPARK-21476: - [~peng.m...@intel.com] Under what circumstances will
