Which predicate pushdowns work or do not work with Parquet?

2017-11-06 Thread Manuel Vonthron
Hi all, I am trying to determine which predicate pushdowns work and which do not with Spark+Parquet (mostly for versions 2.1.0 and/or 2.2.0). I've read a lot of messages, pull request comments, JIRA tickets, and even the comments in Parquet's source, but it's hard to get a clear picture of…
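A quick way to see what actually gets pushed is to inspect the physical plan: predicates Spark hands to the Parquet reader show up in the scan node's PushedFilters list. A minimal sketch, assuming a hypothetical Parquet path and column names:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("pushdown-check").getOrCreate()
    import spark.implicits._

    // Hypothetical data set; the path and column names are assumptions.
    val df = spark.read.parquet("/tmp/events.parquet")

    // explain() prints the physical plan. Predicates pushed into the Parquet
    // reader appear under "PushedFilters: [...]" in the file scan node; anything
    // Spark could not push remains in a separate Filter node above the scan.
    df.filter($"id" > 100 && $"name".startsWith("a")).explain()

Note that appearing in PushedFilters only means the filter was offered to the source; whether Parquet can actually use it (e.g. via row-group statistics) depends on the column type and the Spark/Parquet versions involved, which is exactly what the question is trying to pin down.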

Re: spark-avro aliases incompatible

2017-11-06 Thread Gourav Sengupta
Hi, I may be wrong about this, but when you are using format("") you are basically using old Spark classes, which still exist for backward compatibility. Please refer to the following documentation to take advantage of the recent changes in Spark: …
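For context, a sketch of the two read styles being contrasted, using the Databricks spark-avro package (the paths are placeholders):

    import org.apache.spark.sql.SparkSession
    import com.databricks.spark.avro._

    val spark = SparkSession.builder().appName("avro-read").getOrCreate()

    // Generic DataSource API: the source is resolved by its qualified name.
    val viaFormat = spark.read.format("com.databricks.spark.avro").load("/tmp/data.avro")

    // Shorthand enabled by the spark-avro import above; both produce the same DataFrame.
    val viaShorthand = spark.read.avro("/tmp/data.avro")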

Re: Structured Streaming equivalent of reduceByKey

2017-11-06 Thread Michael Armbrust
Hmmm, I see. You could output the delta using flatMapGroupsWithState.
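A minimal sketch of that idea, with made-up types and a socket source standing in for the real stream: keep a running sum per key in state and emit the updated total whenever new rows arrive, a streaming analogue of reduceByKey(_ + _).

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, OutputMode}

    case class Event(key: String, value: Long)
    case class Total(key: String, total: Long)

    val spark = SparkSession.builder().appName("delta-sums").getOrCreate()
    import spark.implicits._

    // Hypothetical input: "key,value" lines arriving on a socket.
    val events = spark.readStream
      .format("socket").option("host", "localhost").option("port", 9999).load()
      .as[String]
      .map { line => val Array(k, v) = line.split(","); Event(k, v.toLong) }

    // The state per key is the running sum; each trigger emits the new total.
    val totals = events
      .groupByKey(_.key)
      .flatMapGroupsWithState[Long, Total](OutputMode.Update(), GroupStateTimeout.NoTimeout()) {
        (key: String, rows: Iterator[Event], state: GroupState[Long]) =>
          val sum = state.getOption.getOrElse(0L) + rows.map(_.value).sum
          state.update(sum)
          Iterator(Total(key, sum))
      }

    totals.writeStream.outputMode("update").format("console").start().awaitTermination()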

A PySpark SQL query

2017-11-06 Thread paulgureghian
Are the min, max, and mean functions correct?
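For reference, a tiny sketch of those built-in aggregate functions (the data and column name are made up, since the original query isn't visible):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.{min, max, mean}

    val spark = SparkSession.builder().appName("agg-demo").getOrCreate()
    import spark.implicits._

    // Made-up numbers; min/max/mean are standard built-in aggregates,
    // and mean is an alias of avg.
    val df = Seq(1.0, 2.0, 3.0, 4.0).toDF("price")
    df.agg(min("price"), max("price"), mean("price")).show()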

Re: spark-avro aliases incompatible

2017-11-06 Thread Gaspar Muñoz
Of course. Right now I'm trying it locally with Spark 2.2.0 and spark-avro 4.0.0. I've just uploaded a snippet: https://gist.github.com/gasparms/5d0740bd61a500357e0230756be963e1 Basically, my Avro schema has a field with an alias, and in the last part of the code spark-avro is not able to read old data…
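For readers unfamiliar with the feature: an Avro alias is a field-level "aliases" attribute in the reader schema that lets a new field name match data written under an old name. An illustrative (made-up) reader schema, embedded as a string:

    // Files written with "old_name" should still be readable as "new_name"
    // when the reader honors the alias; the thread reports spark-avro does not.
    val readerSchema = """
      {
        "type": "record",
        "name": "User",
        "fields": [
          { "name": "new_name", "type": "string", "aliases": ["old_name"] }
        ]
      }
    """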

PySpark driver memory limit

2017-11-06 Thread Nicolas Paris
Hi there. Can anyone clarify the driver memory aspects of PySpark? According to [1], spark.driver.memory limits JVM + Python memory. In the case spark.driver.memory=2G, does that mean the user won't be able to use more than 2G, whatever Python code and RDD work they are doing? Thanks. [1]: …
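Not a definitive answer, but for reference: spark.driver.memory sizes the driver JVM heap, and in client mode it must be set before the JVM launches, so it is normally passed via spark-submit --driver-memory or spark-defaults.conf rather than in application code. A small sketch for checking the effective value at runtime:

    import org.apache.spark.sql.SparkSession

    // Assumes the application was launched with something like:
    //   spark-submit --driver-memory 2g app.py
    // Setting spark.driver.memory inside the application has no effect in
    // client mode, because the driver JVM is already running at that point.
    val spark = SparkSession.builder().getOrCreate()
    println(spark.conf.get("spark.driver.memory", "not set"))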

Building Spark with Hive 1.1.0

2017-11-06 Thread HARSH TAKKAR
Hi, I am using the Cloudera (CDH 5.11.0) setup, which has Hive version 1.1.0, but when I build Spark with Hive and Thrift support it packages Hive version 1.6.0. Please let me know how I can build Spark with Hive 1.1.0. The command I am using to build: ./dev/make-distribution.sh --name …