[GitHub] spark pull request: [SPARK-2410][SQL][WIP] Cherry picked Hive Thri...

liancheng Sun, 20 Jul 2014 23:16:18 -0700

Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1399#discussion_r15155961
  
    --- Diff: docs/sql-programming-guide.md ---
    @@ -573,4 +572,170 @@ prefixed with a tick (`'`).  Implicit conversions turn these symbols into expres
     evaluated by the SQL execution engine.  A full list of the functions supported can be found in the
     [ScalaDoc](api/scala/index.html#org.apache.spark.sql.SchemaRDD).
     
    -<!-- TODO: Include the table of operations here. -->
    \ No newline at end of file
    +<!-- TODO: Include the table of operations here. -->
    +
    +## Running the Thrift JDBC server
    +
    +The Thrift JDBC server implemented here corresponds to the [`HiveServer2`]
    +(https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2) in Hive 0.12. You can test
    +the JDBC server with the beeline script comes with either Spark or Hive 0.12.
    +
    +To start the JDBC server, run the following in the Spark directory:
    +
    +    ./sbin/start-thriftserver.sh
    +
    +The default port the server listens on is 10000.  Now you can use beeline to test the Thrift JDBC
    +server:
    +
    +    ./bin/beeline
    +
    +Connect to the JDBC server in beeline with:
    +
    +    beeline> !connect jdbc:hive2://localhost:10000
    +
    +Beeline will ask you for a username and password. In non-secure mode, simply enter the username on
    +your machine and a blank password. For secure mode, please follow the instructions given in the
    +[beeline documentation](https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients)
    +
    +Configuration of Hive is done by placing your `hive-site.xml` file in `conf/`.
    +
    +You may also use the beeline script comes with Hive.
    +
    +### Migration Guide for Shark Users
    +
    +#### Reducer number
    +
    +In Shark, default reducer number is 1, and can be tuned by property `mapred.reduce.tasks`. In Spark SQL, reducer number is default to 200, and can be customized by the `spark.sql.shuffle.partitions` property:
    --- End diff --
    
    That seems like a good idea. I would also add a WARN log telling users that this property is deprecated.
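
A rough sketch of that behaviour, written in Python for brevity rather than in Spark's own Scala. It assumes that "this property" refers to `mapred.reduce.tasks` and that the idea is to keep accepting it while forwarding the value to `spark.sql.shuffle.partitions`. The names `DEPRECATED_KEYS` and `set_conf` are hypothetical and do not correspond to any Spark API.

```python
import logging

logger = logging.getLogger("shark_migration_sketch")

# Hypothetical mapping from a deprecated Shark/Hive key to its Spark SQL
# replacement; the pair itself comes from the migration guide quoted above.
DEPRECATED_KEYS = {"mapred.reduce.tasks": "spark.sql.shuffle.partitions"}


def set_conf(conf: dict, key: str, value: str) -> None:
    """Store a setting, translating deprecated keys and warning about them."""
    new_key = DEPRECATED_KEYS.get(key)
    if new_key is not None:
        # Emit the WARN log suggested in the review comment.
        logger.warning(
            "Property %s is deprecated; setting %s instead.", key, new_key
        )
        key = new_key
    conf[key] = value


# Usage: the deprecated key is accepted but stored under the new name.
conf = {}
set_conf(conf, "mapred.reduce.tasks", "10")
```

Keeping the translation in one lookup table means further deprecated keys could be added later without touching the setter logic.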



