date:20180113

Re: Distinct on Map data type -- SPARK-19893

2018-01-13 Thread ckhari4u

Wan, thanks a lot! I see the issue now. Do we have any JIRAs open for the future work to be done on this?

Re: [VOTE] Spark 2.3.0 (RC1)

2018-01-13 Thread Sean Owen

The signatures and licenses look OK. Except for the missing k8s package, the contents look OK. Tests look pretty good with "-Phive -Phadoop-2.7 -Pyarn" on Ubuntu 17.10, except that KafkaContinuousSourceSuite seems to hang forever. That was just fixed and needs to get into an RC? Aside from the Blo

transformSchema method policy for "duplicated" column names

2018-01-13 Thread Alessandro Solimando

Hello everyone, after one month without any reply on Stack Overflow (https://stackoverflow.com/questions/47789265/inconsistency-in-handling-duplicate-names-in-dataframe-schema), I am posing the question here. Context: I am refactoring some code of mine, transforming Scala methods with a signature

Join Strategies

2018-01-13 Thread Marco Gaido

Hi dev, I have a question about how join strategies are defined. I see that CartesianProductExec is used only for InnerJoin, while for other kinds of joins BroadcastNestedLoopJoinExec is used. For reference: https://github.com/apache/spark/blob/cd9f49a2aed3799964976ead06080a0f7044a0c3/sql/core/src

Remove or rename? What does ResolvedDataSourceSuite test?

2018-01-13 Thread Jacek Laskowski

Hi, it looks like ResolvedDataSourceSuite [1] is a leftover (after ResolveDataSource?). If it is not to be deleted, ResolvedDataSourceSuite should surely be renamed. Correct? [1] https://github.com/apache/spark/blob/master/sql/core/src/test/scala/org/apache/spark/sql/sources/ResolvedDataSourceSuite.s

Re: Compiling Spark UDF at runtime

2018-01-13 Thread Michael Shtelma

Thanks! Yes, this would be an option, of course: HDFS or Alluxio. Sincerely, Michael Shtelma On Fri, Jan 12, 2018 at 3:26 PM, Georg Heiler wrote: > You could store the jar in HDFS. Then even in YARN cluster mode your given workaround should work. > Michael Shtelma schrieb am Fr. 12. Jan. 2018 u

Re: Distinct on Map data type -- SPARK-19893

2018-01-13 Thread Wenchen Fan

A very simple example is: sql("select create_map(1, 'a', 2, 'b')").union(sql("select create_map(2, 'b', 1, 'a')")).distinct By definition, a map should not care about the order of its entries, so the above query should return one record. However, it returns 2 records before SPARK-19893. On Sat,
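
The expectation stated above, that two maps with the same entries in a different order count as one distinct value, can be illustrated with plain Scala collections. This is only a sketch of the order-insensitive semantics, not of Spark's behavior or of the SPARK-19893 change; the object name MapDistinctSketch and the helper distinctCount are made up for illustration.

```scala
object MapDistinctSketch {
  // Hypothetical helper: counts how many distinct maps a sequence contains.
  // Scala's immutable Map defines equals/hashCode over its key/value pairs,
  // so the order in which entries were written does not matter.
  def distinctCount(maps: Seq[Map[Int, String]]): Int =
    maps.distinct.size

  def main(args: Array[String]): Unit = {
    val first  = Map(1 -> "a", 2 -> "b")
    val second = Map(2 -> "b", 1 -> "a")

    // Same entries, different order: the maps compare as equal.
    println(first == second)

    // Because they are equal, distinct keeps only one of them.
    println(distinctCount(Seq(first, second)))
  }
}
```

Running main prints true and then 1, which is the "one record" outcome the message describes as the correct result for the map case.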
