[jira] [Commented] (SPARK-9686) Spark hive jdbc client cannot get table from metadata store

2015-11-17 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15009033#comment-15009033
 ] 

Cheng Lian commented on SPARK-9686:
---

Tested 1.7-SNAPSHOT 
([fa13301|https://github.com/apache/spark/commit/fa13301ae440c4c9594280f236bcca11b62fdd29])
 under several different configurations using Hive 1.2.1 and a small JDBC 
testing program (attached at the end).

# Embedded metastore
  Remove {{conf/hive-site.xml}}, start the Thrift server using 
{{./sbin/start-thriftserver.sh}}, and execute the test program.
#- {{getSchemas()}} only returns {{default}}.
#- {{getColumns}} returns nothing.
# Local metastore
  Configure {{conf/hive-site.xml}} to point to a local PostgreSQL-backed Hive 
1.2.1 metastore.
  Leave {{hive.metastore.uris}} empty (i.e., disable the remote metastore). Start 
the Thrift server and execute the test program.
#- {{getSchemas()}} only returns {{default}}.
#- {{getColumns}} returns nothing.
# Remote metastore
  Configure {{conf/hive-site.xml}} to point to a remote PostgreSQL-backed 
Hive 1.2.1 metastore.
  Set {{hive.metastore.uris}} to {{thrift://localhost:9083}}. Start the metastore 
service using {{$HIVE_HOME/bin/hive --service metastore}}, start the Thrift server, 
and execute the test program.
#- {{getSchemas()}} returns all defined databases.
#- {{getColumns}} returns columns defined in all tables.
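For reference, the {{conf/hive-site.xml}} used for the 3rd (remote metastore) test has roughly the following shape. The host, port, database name, and credentials below are placeholders, not the actual values used:

{code:xml}
<!-- Sketch of conf/hive-site.xml for the remote metastore test.
     All values below are placeholders. -->
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://localhost:9083</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:postgresql://localhost:5432/metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.postgresql.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hive</value>
  </property>
</configuration>
{code}

For the 2nd (local metastore) test, the same file is used with {{hive.metastore.uris}} left empty.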

However, this doesn't imply that using a remote metastore works around this issue.  
After some investigation, I think there are two separate but related issues:
# In {{HiveThriftServer2}}, although all SQL commands are dispatched to the 
metadata Hive client and execution Hive client properly, conventional JDBC 
calls still use the default {{HiveServer2}} implementation (e.g. 
{{getSchemas()}} is handled by 
{{o.a.hive.service.cli.CLIService.getSchemas()}}). These calls are not 
dispatched and are always executed by the execution Hive client, which points to 
the dummy local Derby metastore.
  We should override the corresponding methods in {{SparkSQLCLIService}} and 
dispatch these JDBC calls to the metastore Hive client.
# When using a remote metastore, the execution Hive client is somehow initialized 
to point to the actual remote metastore instead of the dummy local Derby metastore.
  I haven't figured out the root cause, but single-step debugging shows that 
the execution Hive client does point to the remote metastore. My guess is that 
{{hive.metastore.uris}} takes higher precedence than 
{{javax.jdo.option.ConnectionURL}} and overrides the latter when a {{Hive}} 
object is being initialized.
  It's because of this issue that the 3rd test mentioned above returns the 
correct answer.  This issue can be reliably reproduced on my local machine. 
However, according to [~navis]'s comment, the remote metastore didn't work for him 
either, probably because of other environmental factors.
  Filing a separate JIRA ticket for this one.
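To illustrate the fix proposed for the 1st issue, the override in {{SparkSQLCLIService}} could look roughly like the sketch below. This is illustrative only: {{withMetadataHiveState}} is a hypothetical helper for routing the call through the metastore Hive client, and the method signature follows Hive 1.2.1's {{CLIService.getSchemas()}}.

{code}
// Sketch only: withMetadataHiveState is hypothetical; the real fix still needs
// to decide how SparkSQLCLIService reaches the metastore Hive client.
override def getSchemas(
    sessionHandle: SessionHandle,
    catalogName: String,
    schemaName: String): OperationHandle = {
  // By default, CLIService.getSchemas() runs against the execution Hive
  // client, which points to the dummy local Derby metastore. Route the call
  // through the metastore Hive client instead before delegating.
  withMetadataHiveState(sessionHandle) {
    super.getSchemas(sessionHandle, catalogName, schemaName)
  }
}
{code}

Similar overrides would be needed for {{getTables()}}, {{getColumns()}}, and the other JDBC metadata calls.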

The JDBC testing program:

{code}
import java.sql.DriverManager

object JDBCExperiments {
  def main(args: Array[String]) {
    val url = "jdbc:hive2://localhost:1/default"
    val username = "lian"
    val password = ""

    try {
      Class.forName("org.apache.hive.jdbc.HiveDriver")
      val connection = DriverManager.getConnection(url, username, password)
      val metadata = connection.getMetaData

      // TABLE_SCHEM and TABLE_CATALOG columns of getSchemas()
      val schemas = metadata.getSchemas()
      while (schemas.next()) {
        val (schema, catalog) = (schemas.getString(1), schemas.getString(2))
        println(s"$schema: $catalog")
      }

      val tables = metadata.getTables(null, null, null, null)
      while (tables.next()) {
        val fields = Array.tabulate(5) { i =>
          tables.getString(i + 1)
        }
        println(fields.mkString(", "))
      }

      val columns = metadata.getColumns(null, null, null, null)
      while (columns.next()) {
        println((columns.getString(3), columns.getString(4),
          columns.getString(6)))
      }

      connection.close()
    } catch {
      case e: Exception => e.printStackTrace()
    }
  }
}
{code}


> Spark hive jdbc client cannot get table from metadata store
> ---
>
> Key: SPARK-9686
> URL: https://issues.apache.org/jira/browse/SPARK-9686
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.4.0, 1.4.1, 1.5.0, 1.5.1
>Reporter: pin_zhang
>Assignee: Cheng Lian
> Attachments: SPARK-9686.1.patch.txt
>
>
> 1. Start start-thriftserver.sh
> 2. Connect with beeline
> 3. Create a table
> 4. Show tables; the newly created table is returned
> 5. Run:
>      Class.forName("org.apache.hive.jdbc.HiveDriver");
>      String URL = "jdbc:hive2://localhost:1/default";
>      Properties info = new Properties();
>      Connection conn = DriverManager.getConnection(URL, info);
>      ResultSet tables = conn.getMetaData().getTables(conn.getCatalog(),
>          null, null, null);
> Problem:
>    No tables are returned by this API; this worked in Spark 1.3.

[jira] [Commented] (SPARK-9686) Spark hive jdbc client cannot get table from metadata store

2015-11-16 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15006352#comment-15006352
 ] 

Navis commented on SPARK-9686:
--

[~lian cheng] It's configured with a remote database (MariaDB in my case), of 
course. But those values are overwritten with Derby values when SparkSQLEnv is 
initialized. I've just overwritten them again with the values in metadataHive 
before running JDBC commands. I cannot imagine why it's so badly twisted around 
the Spark Thrift server.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-9686) Spark hive jdbc client cannot get table from metadata store

2015-11-11 Thread Cheng Lian (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15000215#comment-15000215
 ] 

Cheng Lian commented on SPARK-9686:
---

[~navis] [~bugg_tb] [~pin_zhang] May I ask whether you were all using an embedded 
or local metastore? Namely, is {{hive.metastore.uris}} configured to be empty?




[jira] [Commented] (SPARK-9686) Spark hive jdbc client cannot get table from metadata store

2015-11-05 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992767#comment-14992767
 ] 

Navis commented on SPARK-9686:
--

[~pin_zhang] I met the same problem, and the attached patch is what I'm using 
(rebased on the master branch). It's not implemented as cleanly as Databricks 
would want for inclusion, but it worked for me.




[jira] [Commented] (SPARK-9686) Spark hive jdbc client cannot get table from metadata store

2015-08-20 Thread pin_zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14704374#comment-14704374
 ] 

pin_zhang commented on SPARK-9686:
--

What's the status of this bug? Will it be fixed in 1.4.x?




[jira] [Commented] (SPARK-9686) Spark hive jdbc client cannot get table from metadata store

2015-08-14 Thread Tom Barber (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14697881#comment-14697881
 ] 

Tom Barber commented on SPARK-9686:
---

Spent all afternoon trying to locate this nugget of information! It certainly 
doesn't work here, but it does in 1.3, which obviously impacts a bunch of data 
visualization tools.
