[jira] [Commented] (SPARK-20336) spark.read.csv() with wholeFile=True option fails to read non ASCII unicode characters

2017-04-16 Thread HanCheol Cho (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970259#comment-15970259 ] HanCheol Cho commented on SPARK-20336: -- Hi, [~hyukjin.kwon] I found that this case

[jira] [Issue Comment Deleted] (SPARK-9278) DataFrameWriter.insertInto inserts incorrect data

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-9278: Comment: was deleted (was: The result might be definitely different as I ran the codes below with m

[jira] [Resolved] (SPARK-9278) DataFrameWriter.insertInto inserts incorrect data

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-9278. - Resolution: Not A Problem I tried to reproduce the codes above. {code} import pandas pdf = panda

[jira] [Commented] (SPARK-20336) spark.read.csv() with wholeFile=True option fails to read non ASCII unicode characters

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970262#comment-15970262 ] Hyukjin Kwon commented on SPARK-20336: -- Sure, definitely. No need to be in a hurry.

[jira] [Comment Edited] (SPARK-20336) spark.read.csv() with wholeFile=True option fails to read non ASCII unicode characters

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970262#comment-15970262 ] Hyukjin Kwon edited comment on SPARK-20336 at 4/16/17 7:16 AM:

[jira] [Resolved] (SPARK-10109) NPE when saving Parquet To HDFS

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-10109. -- Resolution: Duplicate ^ i am resolving this as it looks a subset of SPARK-20038. Please reopen

[jira] [Commented] (SPARK-10294) When Parquet writer's close method throws an exception, we will call close again and trigger a NPE

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970266#comment-15970266 ] Hyukjin Kwon commented on SPARK-10294: -- Would this be resolvable maybe? > When Parq

[jira] [Created] (SPARK-20349) ListFunction returns duplicate functions after using persistent functions

2017-04-16 Thread Xiao Li (JIRA)
Xiao Li created SPARK-20349: --- Summary: ListFunction returns duplicate functions after using persistent functions Key: SPARK-20349 URL: https://issues.apache.org/jira/browse/SPARK-20349 Project: Spark

[jira] [Resolved] (SPARK-10746) count ( distinct columnref) over () returns wrong result set

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-10746. -- Resolution: Not A Problem I am resolving this per the comment above. I think this applies to .

[jira] [Updated] (SPARK-20349) ListFunctions returns duplicate functions after using persistent functions

2017-04-16 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20349: Summary: ListFunctions returns duplicate functions after using persistent functions (was: ListFunction ret

[jira] [Resolved] (SPARK-11186) Caseness inconsistency between SQLContext and HiveContext

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-11186. -- Resolution: Cannot Reproduce I can't run the codes as reported. I am resolving this per ... {q

[jira] [Assigned] (SPARK-20349) ListFunctions returns duplicate functions after using persistent functions

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20349: Assignee: Apache Spark (was: Xiao Li) > ListFunctions returns duplicate functions after u

[jira] [Assigned] (SPARK-20349) ListFunctions returns duplicate functions after using persistent functions

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20349: Assignee: Xiao Li (was: Apache Spark) > ListFunctions returns duplicate functions after u

[jira] [Commented] (SPARK-20349) ListFunctions returns duplicate functions after using persistent functions

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970270#comment-15970270 ] Apache Spark commented on SPARK-20349: -- User 'gatorsmile' has created a pull request

[jira] [Commented] (SPARK-12259) Kryo/javaSerialization encoder are not composable

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970280#comment-15970280 ] Hyukjin Kwon commented on SPARK-12259: -- [~smilegator], I just happened to try to rep

[jira] [Commented] (SPARK-20346) sum aggregate over empty Dataset gives null

2017-04-16 Thread Jacek Laskowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970281#comment-15970281 ] Jacek Laskowski commented on SPARK-20346: - [~hyukjin.kwon] Caught me! I didn't th

[jira] [Resolved] (SPARK-12677) Lazy file discovery for parquet

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-12677. -- Resolution: Duplicate I just realised that there is the option for this case. I am resolving th

[jira] [Reopened] (SPARK-12677) Lazy file discovery for parquet

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-12677: -- > Lazy file discovery for parquet > --- > > Key: SPARK-

[jira] [Resolved] (SPARK-12677) Lazy file discovery for parquet

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-12677. -- Resolution: Not A Problem It sounds it was decided to explicitly throws an exception in SPARK-1

[jira] [Issue Comment Deleted] (SPARK-12677) Lazy file discovery for parquet

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-12677: - Comment: was deleted (was: I just realised that there is the option for this case. I am resolving

[jira] [Commented] (SPARK-20346) sum aggregate over empty Dataset gives null

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970285#comment-15970285 ] Hyukjin Kwon commented on SPARK-20346: -- Actually, I and [~a1ray] had a discussion ab

[jira] [Comment Edited] (SPARK-20346) sum aggregate over empty Dataset gives null

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970285#comment-15970285 ] Hyukjin Kwon edited comment on SPARK-20346 at 4/16/17 8:24 AM:

[jira] [Commented] (SPARK-12677) Lazy file discovery for parquet

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970286#comment-15970286 ] Hyukjin Kwon commented on SPARK-12677: -- Please reopen this if anyone would like to s

[jira] [Resolved] (SPARK-13301) PySpark Dataframe return wrong results with custom UDF

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-13301. -- Resolution: Cannot Reproduce Per the description in the JIRA, I can't reproduce this. I can onl

[jira] [Resolved] (SPARK-13491) Issue using table alias in Spark SQL case statement

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-13491. -- Resolution: Cannot Reproduce {code} val employee1 = Seq(Tuple2("Hyukjin", 1), Tuple2("Tom", 2))

[jira] [Commented] (SPARK-13644) Add the source file name and line into Logger when an exception occurs in the generated code

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970295#comment-15970295 ] Hyukjin Kwon commented on SPARK-13644: -- gentle ping [~kiszk], is this resolvable? >

[jira] [Commented] (SPARK-13680) Java UDAF with more than one intermediate argument returns wrong results

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970296#comment-15970296 ] Hyukjin Kwon commented on SPARK-13680: -- Could we narrow down the scope? It seems the

[jira] [Resolved] (SPARK-14057) sql time stamps do not respect time zones

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-14057. -- Resolution: Duplicate Apparently it seems fixed in SPARK-18936. If it refers other datasources

[jira] [Resolved] (SPARK-14097) Spark SQL Optimization is not consistent

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-14097. -- Resolution: Invalid Is the issue different plans? It really looks hard to read and at least for

[jira] [Commented] (SPARK-14584) Improve recognition of non-nullability in Dataset transformations

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970306#comment-15970306 ] Hyukjin Kwon commented on SPARK-14584: -- [~joshrosen], it seems now it recognise the

[jira] [Commented] (SPARK-14651) CREATE TEMPORARY TABLE is not supported yet

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970307#comment-15970307 ] Hyukjin Kwon commented on SPARK-14651: -- I just added affected version as I can repro

[jira] [Updated] (SPARK-14651) CREATE TEMPORARY TABLE is not supported yet

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-14651: - Affects Version/s: 2.2.0 > CREATE TEMPORARY TABLE is not supported yet >

[jira] [Commented] (SPARK-14764) Spark SQL documentation should be more precise about which SQL features it supports

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970310#comment-15970310 ] Hyukjin Kwon commented on SPARK-14764: -- How about https://docs.databricks.com/spark/

[jira] [Commented] (SPARK-15071) Check the result of all TPCDS queries

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970312#comment-15970312 ] Hyukjin Kwon commented on SPARK-15071: -- [~nirmannarang] How has it been going? > Ch

[jira] [Commented] (SPARK-13644) Add the source file name and line into Logger when an exception occurs in the generated code

2017-04-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970313#comment-15970313 ] Kazuaki Ishizaki commented on SPARK-13644: -- It is not resolved yet. But, let me

[jira] [Closed] (SPARK-13644) Add the source file name and line into Logger when an exception occurs in the generated code

2017-04-16 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kazuaki Ishizaki closed SPARK-13644. Resolution: Unresolved > Add the source file name and line into Logger when an exception oc

[jira] [Commented] (SPARK-14584) Improve recognition of non-nullability in Dataset transformations

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970316#comment-15970316 ] Hyukjin Kwon commented on SPARK-14584: -- [~joshrosen], I am pretty sure this is fixed

[jira] [Assigned] (SPARK-20344) Duplicate call in FairSchedulableBuilder.addTaskSetManager

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20344: Assignee: (was: Apache Spark) > Duplicate call in FairSchedulableBuilder.addTaskSetMan

[jira] [Commented] (SPARK-20344) Duplicate call in FairSchedulableBuilder.addTaskSetManager

2017-04-16 Thread Robert Stupp (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970319#comment-15970319 ] Robert Stupp commented on SPARK-20344: -- Yea, true. I've updated the patch and [submi

[jira] [Assigned] (SPARK-20344) Duplicate call in FairSchedulableBuilder.addTaskSetManager

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20344: Assignee: Apache Spark > Duplicate call in FairSchedulableBuilder.addTaskSetManager >

[jira] [Commented] (SPARK-20344) Duplicate call in FairSchedulableBuilder.addTaskSetManager

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970318#comment-15970318 ] Apache Spark commented on SPARK-20344: -- User 'snazy' has created a pull request for

[jira] [Resolved] (SPARK-15848) Spark unable to read partitioned table in avro format and column name in upper case

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-15848. -- Resolution: Cannot Reproduce I am resolving this per the comment above ^. > Spark unable to re

[jira] [Comment Edited] (SPARK-15848) Spark unable to read partitioned table in avro format and column name in upper case

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970333#comment-15970333 ] Hyukjin Kwon edited comment on SPARK-15848 at 4/16/17 10:57 AM: ---

[jira] [Commented] (SPARK-19851) Add support for EVERY and ANY (SOME) aggregates

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970337#comment-15970337 ] Apache Spark commented on SPARK-19851: -- User 'ptkool' has created a pull request for

[jira] [Updated] (SPARK-16544) Support for conversion from compatible schema for Parquet data source when data types are not matched

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-16544: - Affects Version/s: 2.2.0 > Support for conversion from compatible schema for Parquet data source

[jira] [Resolved] (SPARK-16562) Do not allow downcast in INT32 based types for non-vectorized Parquet reader

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-16562. -- Resolution: Not A Bug This seems not a problem to me. Please my PR linked. > Do not allow down

[jira] [Resolved] (SPARK-16562) Do not allow downcast in INT32 based types for non-vectorized Parquet reader

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-16562. -- Resolution: Invalid > Do not allow downcast in INT32 based types for non-vectorized Parquet rea

[jira] [Reopened] (SPARK-16562) Do not allow downcast in INT32 based types for non-vectorized Parquet reader

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-16562: -- > Do not allow downcast in INT32 based types for non-vectorized Parquet reader > --

[jira] [Comment Edited] (SPARK-16562) Do not allow downcast in INT32 based types for non-vectorized Parquet reader

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970340#comment-15970340 ] Hyukjin Kwon edited comment on SPARK-16562 at 4/16/17 11:16 AM: ---

[jira] [Resolved] (SPARK-16604) Spark2.0 fail in executing the sql statement which includes partition field in the "select" statement while spark1.6 supports

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-16604. -- Resolution: Cannot Reproduce It sounds almost impossible to reproduce. I am resolving this. Unl

[jira] [Commented] (SPARK-16892) flatten function to get flat array (or map) column from array of array (or array of map) column

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970348#comment-15970348 ] Hyukjin Kwon commented on SPARK-16892: -- Maybe you are looking for something like thi

[jira] [Commented] (SPARK-14764) Spark SQL documentation should be more precise about which SQL features it supports

2017-04-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970350#comment-15970350 ] Sean Owen commented on SPARK-14764: --- Can we simply port that document into the main pro

[jira] [Resolved] (SPARK-19608) setup.py missing reference to pyspark.ml.param

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-19608. -- Resolution: Duplicate Seems added in https://github.com/apache/spark/commit/965c82d8c4b7f2d4df

[jira] [Commented] (SPARK-20023) Can not see table comment when describe formatted table

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970360#comment-15970360 ] Apache Spark commented on SPARK-20023: -- User 'sujith71955' has created a pull reques

[jira] [Commented] (SPARK-20023) Can not see table comment when describe formatted table

2017-04-16 Thread Sujith (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970365#comment-15970365 ] Sujith commented on SPARK-20023: @chenerlu, your point is right, after executing the alte

[jira] [Commented] (SPARK-14764) Spark SQL documentation should be more precise about which SQL features it supports

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970368#comment-15970368 ] Hyukjin Kwon commented on SPARK-14764: -- (For me, I would like to have this one but I

[jira] [Created] (SPARK-20350) Apply Complementation Laws during boolean expression simplification

2017-04-16 Thread Michael Styles (JIRA)
Michael Styles created SPARK-20350: -- Summary: Apply Complementation Laws during boolean expression simplification Key: SPARK-20350 URL: https://issues.apache.org/jira/browse/SPARK-20350 Project: Spar

[jira] [Commented] (SPARK-20350) Apply Complementation Laws during boolean expression simplification

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970372#comment-15970372 ] Apache Spark commented on SPARK-20350: -- User 'ptkool' has created a pull request for

[jira] [Assigned] (SPARK-20350) Apply Complementation Laws during boolean expression simplification

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20350: Assignee: Apache Spark > Apply Complementation Laws during boolean expression simplificati

[jira] [Assigned] (SPARK-20350) Apply Complementation Laws during boolean expression simplification

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20350: Assignee: (was: Apache Spark) > Apply Complementation Laws during boolean expression s

[jira] [Commented] (SPARK-13680) Java UDAF with more than one intermediate argument returns wrong results

2017-04-16 Thread Yael Aharon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970390#comment-15970390 ] Yael Aharon commented on SPARK-13680: - This has been fixed in spark 1.6. It can proba

[jira] [Resolved] (SPARK-19740) Spark executor always runs as root when running on mesos

2017-04-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-19740. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17109 [https://github.co

[jira] [Assigned] (SPARK-19740) Spark executor always runs as root when running on mesos

2017-04-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-19740: - Assignee: Ji Yan > Spark executor always runs as root when running on mesos > --

[jira] [Resolved] (SPARK-20343) SBT master build for Hadoop 2.6 in Jenkins fails due to Avro version resolution

2017-04-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20343. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17642 [https://github.co

[jira] [Assigned] (SPARK-20343) SBT master build for Hadoop 2.6 in Jenkins fails due to Avro version resolution

2017-04-16 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-20343: - Assignee: Hyukjin Kwon > SBT master build for Hadoop 2.6 in Jenkins fails due to Avro version >

[jira] [Comment Edited] (SPARK-13680) Java UDAF with more than one intermediate argument returns wrong results

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970399#comment-15970399 ] Hyukjin Kwon edited comment on SPARK-13680 at 4/16/17 1:40 PM:

[jira] [Commented] (SPARK-13680) Java UDAF with more than one intermediate argument returns wrong results

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970399#comment-15970399 ] Hyukjin Kwon commented on SPARK-13680: -- Thank you so much for your confirmation. I a

[jira] [Resolved] (SPARK-13680) Java UDAF with more than one intermediate argument returns wrong results

2017-04-16 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-13680. -- Resolution: Cannot Reproduce > Java UDAF with more than one intermediate argument returns wrong

[jira] [Commented] (SPARK-20343) SBT master build for Hadoop 2.6 in Jenkins fails due to Avro version resolution

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970424#comment-15970424 ] Apache Spark commented on SPARK-20343: -- User 'HyukjinKwon' has created a pull reques

[jira] [Resolved] (SPARK-20278) Disable 'multiple_dots_linter' lint rule that is against project's code style

2017-04-16 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-20278. -- Resolution: Fixed Assignee: Hyukjin Kwon Fix Version/s: 2.2.0 Targe

[jira] [Commented] (SPARK-20307) SparkR: pass on setHandleInvalid to spark.mllib functions that use StringIndexer

2017-04-16 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970475#comment-15970475 ] Felix Cheung commented on SPARK-20307: -- Thanks for reporting this! Sounds like we sh

[jira] [Commented] (SPARK-18406) Race between end-of-task and completion iterator read lock release

2017-04-16 Thread Yongqin Xiao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970496#comment-15970496 ] Yongqin Xiao commented on SPARK-18406: -- Thanks Josh for the quick response! This iss

[jira] [Commented] (SPARK-20335) Children expressions of Hive UDF impacts the determinism of Hive UDF

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970520#comment-15970520 ] Apache Spark commented on SPARK-20335: -- User 'gatorsmile' has created a pull request

[jira] [Commented] (SPARK-19828) R to support JSON array in column from_json

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970536#comment-15970536 ] Apache Spark commented on SPARK-19828: -- User 'HyukjinKwon' has created a pull reques

[jira] [Commented] (SPARK-9478) Add sample weights to Random Forest

2017-04-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970541#comment-15970541 ] Joseph K. Bradley commented on SPARK-9478: -- By the way, one design choice which h

[jira] [Comment Edited] (SPARK-9478) Add sample weights to Random Forest

2017-04-16 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-9478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970541#comment-15970541 ] Joseph K. Bradley edited comment on SPARK-9478 at 4/16/17 10:36 PM:

[jira] [Created] (SPARK-20351) Add trait hasTrainingSummary to replace the duplicate code

2017-04-16 Thread yuhao yang (JIRA)
yuhao yang created SPARK-20351: -- Summary: Add trait hasTrainingSummary to replace the duplicate code Key: SPARK-20351 URL: https://issues.apache.org/jira/browse/SPARK-20351 Project: Spark Issue

[jira] [Commented] (SPARK-20351) Add trait hasTrainingSummary to replace the duplicate code

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970556#comment-15970556 ] Apache Spark commented on SPARK-20351: -- User 'hhbyyh' has created a pull request for

[jira] [Assigned] (SPARK-20351) Add trait hasTrainingSummary to replace the duplicate code

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20351: Assignee: (was: Apache Spark) > Add trait hasTrainingSummary to replace the duplicate

[jira] [Assigned] (SPARK-20351) Add trait hasTrainingSummary to replace the duplicate code

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20351: Assignee: Apache Spark > Add trait hasTrainingSummary to replace the duplicate code >

[jira] [Updated] (SPARK-20338) Spaces in spark.eventLog.dir are not correctly handled

2017-04-16 Thread zuotingbing (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zuotingbing updated SPARK-20338: Priority: Major (was: Minor) > Spaces in spark.eventLog.dir are not correctly handled > --

[jira] [Comment Edited] (SPARK-20336) spark.read.csv() with wholeFile=True option fails to read non ASCII unicode characters

2017-04-16 Thread HanCheol Cho (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970259#comment-15970259 ] HanCheol Cho edited comment on SPARK-20336 at 4/17/17 1:18 AM:

[jira] [Commented] (SPARK-20156) Java String toLowerCase "Turkish locale bug" causes Spark problems

2017-04-16 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970581#comment-15970581 ] Apache Spark commented on SPARK-20156: -- User 'gatorsmile' has created a pull request

[jira] [Closed] (SPARK-20346) sum aggregate over empty Dataset gives null

2017-04-16 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li closed SPARK-20346. --- Resolution: Not A Problem > sum aggregate over empty Dataset gives null > ---

[jira] [Commented] (SPARK-20346) sum aggregate over empty Dataset gives null

2017-04-16 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970612#comment-15970612 ] Xiao Li commented on SPARK-20346: - {quote} The result of the COUNT and COUNT_BIG function

[jira] [Commented] (SPARK-20299) NullPointerException when null and string are in a tuple while encoding Dataset

2017-04-16 Thread Umesh Chaudhary (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970652#comment-15970652 ] Umesh Chaudhary commented on SPARK-20299: - My bad, previously I was indeed trying

[jira] [Created] (SPARK-20352) PySpark SparkSession initialization take longer every iteration in a single application

2017-04-16 Thread hosein (JIRA)
hosein created SPARK-20352: -- Summary: PySpark SparkSession initialization take longer every iteration in a single application Key: SPARK-20352 URL: https://issues.apache.org/jira/browse/SPARK-20352 Project:

[jira] [Updated] (SPARK-20352) PySpark SparkSession initialization take longer every iteration in a single application

2017-04-16 Thread hosein (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hosein updated SPARK-20352: --- Environment: linux ubuntu 12 pyspark was: linux ubunto 12 pyspark > PySpark SparkSession initialization

[jira] [Updated] (SPARK-20352) PySpark SparkSession initialization take longer every iteration in a single application

2017-04-16 Thread hosein (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hosein updated SPARK-20352: --- Environment: linux ubunto 12 spark 2.1 JRE 8.0 was: linux ubuntu 12 pyspark > PySpark SparkSession init

[jira] [Updated] (SPARK-20352) PySpark SparkSession initialization take longer every iteration in a single application

2017-04-16 Thread hosein (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hosein updated SPARK-20352: --- Environment: Ubuntu 12 Spark 2.1 JRE 8.0 Python 2.7 was: linux ubunto 12 spark 2.1 JRE 8.0 > PySpark S

[jira] [Updated] (SPARK-20352) PySpark SparkSession initialization take longer every iteration in a single application

2017-04-16 Thread hosein (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hosein updated SPARK-20352: --- Description: I run Spark on a standalone Ubuntu server with 128G memory and 32-core CPU. Run spark-submit my