[ https://issues.apache.org/jira/browse/ARROW-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261673#comment-16261673 ]
ASF GitHub Bot commented on ARROW-1268:
---------------------------------------

wesm closed pull request #1344: ARROW-1268: [SITE][FOLLOWUP] Update Spark Post to Reflect Conf Change
URL: https://github.com/apache/arrow/pull/1344

This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance:

diff --git a/site/_posts/2017-07-26-spark-arrow.md b/site/_posts/2017-07-26-spark-arrow.md
index c4b16c073..211e5a481 100644
--- a/site/_posts/2017-07-26-spark-arrow.md
+++ b/site/_posts/2017-07-26-spark-arrow.md
@@ -57,7 +57,7 @@ the conversion to Arrow data can be done on the JVM and pushed back for the Spark
 executors to perform in parallel, drastically reducing the load on the driver.
 
 As of the merging of [SPARK-13534][5], the use of Arrow when calling `toPandas()`
-needs to be enabled by setting the SQLConf "spark.sql.execution.arrow.enable" to
+needs to be enabled by setting the SQLConf "spark.sql.execution.arrow.enabled" to
 "true". Let's look at a simple usage example.
 
 ```
@@ -84,7 +84,7 @@ In [2]: %time pdf = df.toPandas()
 CPU times: user 17.4 s, sys: 792 ms, total: 18.1 s
 Wall time: 20.7 s
 
-In [3]: spark.conf.set("spark.sql.execution.arrow.enable", "true")
+In [3]: spark.conf.set("spark.sql.execution.arrow.enabled", "true")
 
 In [4]: %time pdf = df.toPandas()
 CPU times: user 40 ms, sys: 32 ms, total: 72 ms
@@ -118,7 +118,7 @@ It is planned to add pyarrow as a pyspark dependency so that
 
 Currently, the controlling SQLConf is disabled by default. This can be enabled
 programmatically as in the example above or by adding the line
-"spark.sql.execution.arrow.enable=true" to `SPARK_HOME/conf/spark-defaults.conf`.
+"spark.sql.execution.arrow.enabled=true" to `SPARK_HOME/conf/spark-defaults.conf`.
 Also, not all Spark data types are currently supported and limited to primitive types.
 Expanded type support is in the works and expected to also be in the Spark

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> [Website] Blog post on Arrow integration with Spark
> ---------------------------------------------------
>
>                 Key: ARROW-1268
>                 URL: https://issues.apache.org/jira/browse/ARROW-1268
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Website
>            Reporter: Bryan Cutler
>            Assignee: Bryan Cutler
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 0.6.0
>

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
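
For readers applying the fix by hand, a minimal sketch of the corrected setting follows. The two key names are taken verbatim from the patch; the helper function and the `spark`/`df` session objects mentioned in comments are illustrative assumptions, not part of the PR:

```python
# Sketch of the SQLConf rename fixed by PR #1344.
# Key names come from the patch; everything else here is illustrative.

OLD_KEY = "spark.sql.execution.arrow.enable"   # pre-rename key, no longer effective
NEW_KEY = "spark.sql.execution.arrow.enabled"  # corrected key from the patch

def spark_defaults_line(enabled: bool = True) -> str:
    """Build the line to append to SPARK_HOME/conf/spark-defaults.conf."""
    return f"{NEW_KEY}={'true' if enabled else 'false'}"

# In a live PySpark session, the equivalent programmatic form is:
#   spark.conf.set(NEW_KEY, "true")
#   pdf = df.toPandas()  # the conversion now uses Arrow

print(spark_defaults_line())  # spark.sql.execution.arrow.enabled=true
```

Note that a job still setting `OLD_KEY` fails silently in the sense that Spark simply ignores the unknown key and falls back to the slower non-Arrow `toPandas()` path, which is why the blog post text needed this follow-up correction.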