[jira] [Commented] (SPARK-9686) Spark hive jdbc client cannot get table from metadata store
[ https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15009033#comment-15009033 ]

Cheng Lian commented on SPARK-9686:
-----------------------------------

Tested 1.7-SNAPSHOT ([fa13301|https://github.com/apache/spark/commit/fa13301ae440c4c9594280f236bcca11b62fdd29]) under several different configurations, using Hive 1.2.1 and a small JDBC testing program (attached at the end).

# Embedded metastore
Remove {{conf/hive-site.xml}}, start the Thrift server using {{./sbin/start-thriftserver.sh}}, and execute the test program.
#- {{getSchemas()}} only returns {{default}}.
#- {{getColumns()}} returns nothing.
# Local metastore
Configure {{conf/hive-site.xml}} to point to a local PostgreSQL-backed Hive 1.2.1 metastore. Leave {{hive.metastore.uris}} empty (i.e. disable the remote metastore). Start the Thrift server and execute the test program.
#- {{getSchemas()}} only returns {{default}}.
#- {{getColumns()}} returns nothing.
# Remote metastore
Configure {{conf/hive-site.xml}} to point to a remote PostgreSQL-backed Hive 1.2.1 metastore. Set {{hive.metastore.uris}} to {{thrift://localhost:9083}}. Start the metastore service using {{$HIVE_HOME/bin/hive --service metastore}}, start the Thrift server, and execute the test program.
#- {{getSchemas()}} returns all defined databases.
#- {{getColumns()}} returns the columns defined in all tables.

However, this doesn't imply that using a remote metastore works around the issue. After some investigation, I think there are two separate but related issues:

# In {{HiveThriftServer2}}, although all SQL commands are properly dispatched to the metadata Hive client and the execution Hive client, conventional JDBC calls still use the default {{HiveServer2}} implementation (e.g. {{getSchemas()}} is handled by {{o.a.hive.service.cli.CLIService.getSchemas()}}). These calls are not dispatched and are always executed by the execution Hive client, which points to the dummy local Derby metastore.
We should override the corresponding methods in {{SparkSQLCLIService}} and dispatch these JDBC calls to the metastore Hive client.
# When using a remote metastore, the execution Hive client is somehow initialized to point to the actual remote metastore instead of the dummy local Derby metastore. I haven't figured out the root cause, but single-step debugging shows that the execution Hive client does point to the remote metastore. My guess is that {{hive.metastore.uris}} takes higher precedence than {{javax.jdo.option.ConnectionURL}} and overrides the latter when a {{Hive}} object is being initialized. It's because of this issue that the 3rd test mentioned above shows the correct answer.
This issue can be steadily reproduced on my local machine. However, according to [~navis]'s comment, the remote metastore didn't work for him either, probably because of other environmental factors. Filing a separate JIRA ticket for this one.

The JDBC testing program:

{code}
import java.sql.DriverManager

object JDBCExperiments {
  def main(args: Array[String]): Unit = {
    val url = "jdbc:hive2://localhost:1/default"
    val username = "lian"
    val password = ""

    // Register the Hive JDBC driver and connect to the Thrift server.
    Class.forName("org.apache.hive.jdbc.HiveDriver")
    val connection = DriverManager.getConnection(url, username, password)

    try {
      val metadata = connection.getMetaData

      // Schemas (databases): column 1 is TABLE_SCHEM, column 2 is TABLE_CATALOG.
      val schema = metadata.getSchemas()
      while (schema.next()) {
        val (key, value) = (schema.getString(1), schema.getString(2))
        println(s"$key: $value")
      }

      // Tables: print the first five columns of each row.
      val tables = metadata.getTables(null, null, null, null)
      while (tables.next()) {
        val fields = Array.tabulate(5) { i => tables.getString(i + 1) }
        println(fields.mkString(", "))
      }

      // Columns: TABLE_NAME, COLUMN_NAME and TYPE_NAME.
      val columns = metadata.getColumns(null, null, null, null)
      while (columns.next()) {
        println((columns.getString(3), columns.getString(4), columns.getString(6)))
      }
    } finally {
      // Release the connection even if one of the metadata calls fails.
      connection.close()
    }
  }
}
{code}

> Spark hive jdbc client cannot get table from metadata store
> -----------------------------------------------------------
>
> Key: SPARK-9686
> URL: https://issues.apache.org/jira/browse/SPARK-9686
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.4.0, 1.4.1, 1.5.0, 1.5.1
> Reporter: pin_zhang
> Assignee: Cheng Lian
> Attachments: SPARK-9686.1.patch.txt
>
>
> 1. Start start-thriftserver.sh
> 2. connect with beeline
> 3. create table
> 4. show tables, the new created table returned
> 5.
>    Class.forName("org.apache.hive.jdbc.HiveDriver");
>    String URL = "jdbc:hive2://localhost:1/default";
>    Properties info = new Properties();
>    Connection conn = DriverManager.getConnection(URL, info);
>    ResultSet tables = conn.getMetaData().getTables(conn.getCatalog(), null, null, null);
> Problem:
>    No tables with returned this API, that work in spark1.3
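The first issue in Cheng Lian's comment above can be pictured with a toy model. This is only a sketch under stated assumptions, not the Spark or Hive code: {{MetaClient}}, {{BaseCliService}} and {{DispatchingCliService}} are hypothetical stand-ins for the Hive clients, {{CLIService}} and {{SparkSQLCLIService}}, and the database names are made up. It shows why a metadata call that is not overridden only ever sees the execution client's dummy Derby metastore, and how an override could route the call to the metastore client instead.

```scala
// Hypothetical model; none of these types exist in Spark or Hive.
final case class MetaClient(name: String, schemas: Seq[String])

// Stands in for the stock HiveServer2 service: metadata calls always go
// to the execution client.
class BaseCliService(executionClient: MetaClient) {
  def getSchemas(): Seq[String] = executionClient.schemas
}

// Stands in for a service that overrides the metadata call and dispatches
// it to the metastore client instead.
class DispatchingCliService(exec: MetaClient, meta: MetaClient)
    extends BaseCliService(exec) {
  override def getSchemas(): Seq[String] = meta.schemas
}

object DispatchSketch {
  // The execution client points at a dummy local Derby metastore.
  val execution: MetaClient = MetaClient("execution (dummy Derby)", Seq("default"))
  // The metastore client points at the real metastore (made-up databases).
  val metastore: MetaClient = MetaClient("metastore", Seq("default", "sales", "logs"))

  def main(args: Array[String]): Unit = {
    // Without the override, only the dummy metastore's schema is visible.
    println(new BaseCliService(execution).getSchemas())
    // With the override, the schemas come from the real metastore.
    println(new DispatchingCliService(execution, metastore).getSchemas())
  }
}
```

In this model the undispatched service reports only {{default}}, which matches the embedded and local metastore test results reported above.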
[jira] [Commented] (SPARK-9686) Spark hive jdbc client cannot get table from metadata store
[ https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15006352#comment-15006352 ]

Navis commented on SPARK-9686:
------------------------------

[~lian cheng] It is configured with a remote database (MariaDB in my case), of course. But those settings are overwritten with Derby values when SparkSQLEnv is initialized. I have just overwritten them again with the values from metadataHive before running JDBC commands. I cannot imagine why this is so badly twisted in the Spark Thrift server.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
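The workaround Navis describes above (putting the metastore connection settings back after they have been replaced with Derby values) might look roughly like the sketch below. It is not taken from the attached patch: configurations are modelled as plain maps, {{RestoreMetastoreConf}} and {{restore}} are hypothetical names, the choice of which {{javax.jdo.option.*}} keys to copy is an assumption, and the MariaDB host is made up.

```scala
object RestoreMetastoreConf {
  // Connection settings to copy back; this key list is an assumption.
  val metastoreKeys: Seq[String] = Seq(
    "javax.jdo.option.ConnectionURL",
    "javax.jdo.option.ConnectionDriverName",
    "javax.jdo.option.ConnectionUserName",
    "javax.jdo.option.ConnectionPassword"
  )

  // Returns `current` with every metastore key that is present in
  // `metadataConf` overwritten by the value from `metadataConf`.
  def restore(
      current: Map[String, String],
      metadataConf: Map[String, String]): Map[String, String] =
    metastoreKeys.foldLeft(current) { (conf, key) =>
      metadataConf.get(key) match {
        case Some(value) => conf.updated(key, value)
        case None        => conf
      }
    }

  def main(args: Array[String]): Unit = {
    // Settings after initialization replaced them with Derby values.
    val overwritten = Map(
      "javax.jdo.option.ConnectionURL" -> "jdbc:derby:;databaseName=metastore_db;create=true",
      "javax.jdo.option.ConnectionDriverName" -> "org.apache.derby.jdbc.EmbeddedDriver",
      "hive.exec.scratchdir" -> "/tmp/hive"
    )
    // Settings of the metadata client (hypothetical host and database).
    val metadata = Map(
      "javax.jdo.option.ConnectionURL" -> "jdbc:mariadb://dbhost:3306/hive",
      "javax.jdo.option.ConnectionDriverName" -> "org.mariadb.jdbc.Driver"
    )
    restore(overwritten, metadata).foreach { case (k, v) => println(s"$k = $v") }
  }
}
```

Keys that the metadata configuration does not define, and keys outside the list, are left untouched.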
[jira] [Commented] (SPARK-9686) Spark hive jdbc client cannot get table from metadata store
[ https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15000215#comment-15000215 ]

Cheng Lian commented on SPARK-9686:
-----------------------------------

[~navis] [~bugg_tb] [~pin_zhang] May I ask whether you were all using an embedded or a local metastore, i.e. with {{hive.metastore.uris}} left empty?
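For readers unsure which of the three setups they are running, the question above can be turned into a small check. This is a rough sketch, not Hive's own logic: a non-empty {{hive.metastore.uris}} selecting a remote metastore is standard Hive behaviour, but telling embedded from local by looking for a Derby JDBC URL is a simplifying assumption, and {{MetastoreModes.classify}} is a hypothetical helper.

```scala
sealed trait MetastoreMode
case object Embedded extends MetastoreMode
case object Local extends MetastoreMode
case object Remote extends MetastoreMode

object MetastoreModes {
  // Classifies a configuration, given as key/value pairs from hive-site.xml.
  def classify(conf: Map[String, String]): MetastoreMode = {
    val uris = conf.getOrElse("hive.metastore.uris", "").trim
    val jdoUrl = conf.getOrElse("javax.jdo.option.ConnectionURL", "").trim
    if (uris.nonEmpty) Remote
    // Assumption: no JDBC URL, or a Derby URL, means the embedded metastore.
    else if (jdoUrl.isEmpty || jdoUrl.startsWith("jdbc:derby:")) Embedded
    else Local
  }

  def main(args: Array[String]): Unit = {
    // No hive-site.xml at all.
    println(classify(Map.empty))
    // PostgreSQL-backed metastore, hive.metastore.uris left empty.
    println(classify(Map(
      "javax.jdo.option.ConnectionURL" -> "jdbc:postgresql://localhost/metastore")))
    // Metastore service reached over Thrift.
    println(classify(Map("hive.metastore.uris" -> "thrift://localhost:9083")))
  }
}
```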
[jira] [Commented] (SPARK-9686) Spark hive jdbc client cannot get table from metadata store
[ https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992767#comment-14992767 ]

Navis commented on SPARK-9686:
------------------------------

[~pin_zhang] I ran into the same problem, and the attached patch is what I'm using (rebased on the master branch). It is not implemented as cleanly as Databricks would want for inclusion, but it works for me.
[jira] [Commented] (SPARK-9686) Spark hive jdbc client cannot get table from metadata store
[ https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704374#comment-14704374 ]

pin_zhang commented on SPARK-9686:
----------------------------------

What's the status of this bug? Will it be fixed in 1.4.x?
[jira] [Commented] (SPARK-9686) Spark hive jdbc client cannot get table from metadata store
[ https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14697881#comment-14697881 ]

Tom Barber commented on SPARK-9686:
-----------------------------------

Spent all afternoon trying to locate this nugget of information! It certainly doesn't work, but it does in 1.3, which obviously impacts a bunch of data vis tools.