I have a different opinion on this. In production deployments (at least
the ones I am aware of), databases are generally managed at the org/group
level. Privacy policies such as ACLs are usually applied at the database
level and need first-level management by admins. With such a setup, it
feels safer to have database creation done through a separate process and
let Hudi hive sync only alter/create tables (the current setup).
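
To make the two options concrete, here is a minimal sketch of how the
proposed statement could be kept behind an opt-in switch, so the default
stays as it is today. This is only an illustration: the class name, method
names and the autoCreateDatabase flag are all hypothetical and are not
existing HiveSyncTool code or config. It only builds the DDL string and
does not talk to Hive.

```java
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hudi: builds the DDL discussed in this thread.
final class HiveDatabaseDdl {

    // Assumption: only plain identifiers are accepted, so the name can be
    // concatenated into the statement without quoting.
    private static final Pattern SAFE_NAME = Pattern.compile("[A-Za-z0-9_]+");

    private HiveDatabaseDdl() {
    }

    // Returns the CREATE DATABASE statement for the given database name.
    static String createDatabaseIfNotExists(String databaseName) {
        if (databaseName == null || !SAFE_NAME.matcher(databaseName).matches()) {
            throw new IllegalArgumentException("Unsafe database name: " + databaseName);
        }
        return "CREATE DATABASE IF NOT EXISTS " + databaseName;
    }

    // Returns the statement to run before table creation, or empty when
    // database creation is left to admins (the current behaviour).
    static Optional<String> preSyncStatement(String databaseName, boolean autoCreateDatabase) {
        return autoCreateDatabase
                ? Optional.of(createDatabaseIfNotExists(databaseName))
                : Optional.empty();
    }

    public static void main(String[] args) {
        String none = "(no statement, database is managed separately)";
        System.out.println(preSyncStatement("test_db", true).orElse(none));
        System.out.println(preSyncStatement("test_db", false).orElse(none));
    }
}
```

With the flag off, nothing changes for deployments where admins own the
database and its ACLs; with it on, the sync would issue the statement
before creating the table.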

Open to hearing others' thoughts.

Regards,
Balaji.V

On Wed, Nov 6, 2019 at 12:01 PM Bhavani Sudha <bhavanisud...@gmail.com>
wrote:

> +1 I think we should create db if it does not exist.
>
> On Tue, Nov 5, 2019 at 11:08 PM Pratyaksh Sharma <pratyaks...@gmail.com>
> wrote:
>
> > Hi,
> >
> > While doing hive sync using HiveSyncTool, we first check if the target
> > table exists in hive. If not, we try to create it. However in this flow, if
> > the database itself does not exist, we do not create the database before
> > creating hive table, which results in exception like below -
> >
> > org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10072]: Database does not exist: test_db
> > at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:335)
> > at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:199)
> > at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:262)
> > at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247)
> > at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:575)
> > at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:561)
> > at sun.reflect.GeneratedMethodAccessor108.invoke(Unknown Source)
> > at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:498)
> > at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:78)
> > at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:36)
> > at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:63)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at javax.security.auth.Subject.doAs(Subject.java:422)
> > at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> > at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:59)
> > at com.sun.proxy.$Proxy68.executeStatementAsync(Unknown Source)
> > at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315)
> > at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:566)
> > at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557)
> > at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542)
> > at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> > at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> > at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56)
> > at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> > ... 3 more
> > Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Database does not exist: test_db
> > at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.getDatabase(BaseSemanticAnalyzer.java:2154)
> >
> >
> > So just wanted to discuss if we should try creating database first in above
> > case using query like -
> >
> > CREATE DATABASE|SCHEMA [IF NOT EXISTS] <database name>
> >
>
