Re: Create external table

Margus Roo Wed, 11 May 2016 02:29:00 -0700

What do you mean?

Margus (margusja) Roo
http://margus.roo.ee
skype: margusja
+372 51 48 780


On 11/05/16 08:21, Mich Talebzadeh wrote:

yes but table then exists correct I mean second time

did you try


*use default;*
*
drop table if exists trips;*
**


it is still within Hive metadata registered as an existing table.
*
*


*



*

Dr Mich Talebzadeh

LinkedIn/https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw/


http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/>

On 11 May 2016 at 06:16, Margus Roo <mar...@roo.ee<mailto:mar...@roo.ee>> wrote:


    Hi

    Thanks for your answer.

    ---

    At first I create an empty hdfs directory (if directory is empty I
    did not have problems before too).

    [margusja@hadoopnn1 ~]$ hdfs dfs -mkdir /user/margusja/trips

    [margusja@hadoopnn1 ~]$ beeline -f create_externat_table_trips.hql
    -u
    
"jdbc:hive2://hadoopnn1.example.com:2181,hadoopnn2.example.com:2181,hadoopdn1.example.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@testhadoop.com"
    
<mailto:jdbc:hive2://hadoopnn1.example.com:2181,hadoopnn2.example.com:2181,hadoopdn1.example.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@testhadoop.com>
    --verbose
    WARNING: Use "yarn jar" to launch YARN applications.
    issuing: !connect jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hadoopnn2.example.com:2181
    
<http://hadoopnn2.example.com:2181>,hadoopdn1.example.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@testhadoop.com
    
<mailto:hadoopdn1.example.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@testhadoop.com>
    '' [passwd stripped]
    Connecting to jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hadoopnn2.example.com:2181
    
<http://hadoopnn2.example.com:2181>,hadoopdn1.example.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@testhadoop.com
    
<mailto:hadoopdn1.example.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@testhadoop.com>
    Connected to: Apache Hive (version 1.2.1.2.3.4.0-3485)
    Driver: Hive JDBC (version 1.2.1.2.3.4.0-3485)
    Transaction isolation: TRANSACTION_REPEATABLE_READ
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> use default;
    Getting log thread is interrupted, since query is done!
    No rows affected (1.225 seconds)
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> drop table if exists trips;
    Getting log thread is interrupted, since query is done!
    No rows affected (0.159 seconds)
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> CREATE EXTERNAL TABLE
    `TRIPS`(
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> `bike_nr` string,
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> `duration` int,
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> `start_date` string,
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> `start_station` string,
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> `end_station` string)
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> PARTITIONED BY (
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> `year` int,
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> `month` string)
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> ROW FORMAT DELIMITED
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> FIELDS TERMINATED BY ','
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> LINES TERMINATED BY '\n'
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> STORED AS INPUTFORMAT
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado>
    'org.apache.hadoop.mapred.TextInputFormat'
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> OUTPUTFORMAT
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado>
    'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> LOCATION
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> '/user/margusja/trips';
    Getting log thread is interrupted, since query is done!
    No rows affected (0.067 seconds)
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> desc trips;
    Getting log thread is interrupted, since query is done!
    
+--------------------------+-----------------------+-----------------------+--+
    |         col_name         |       data_type |        comment        |
    
+--------------------------+-----------------------+-----------------------+--+
    | bike_nr                  | string |                       |
    | duration                 | int |                       |
    | start_date               | string |                       |
    | start_station            | string |                       |
    | end_station              | string |                       |
    | year                     | int |                       |
    | month                    | string |                       |
    |                          | NULL                  |
    NULL                  |
    | # Partition Information  | NULL                  |
    NULL                  |
    | # col_name               | data_type             |
    comment               |
    |                          | NULL                  |
    NULL                  |
    | year                     | int |                       |
    | month                    | string |                       |
    
+--------------------------+-----------------------+-----------------------+--+
    13 rows selected (0.46 seconds)
    0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hado> !exit
    Closing: 0: jdbc:hive2://hadoopnn1.example.com:2181
    <http://hadoopnn1.example.com:2181>,hadoopnn2.example.com:2181
    
<http://hadoopnn2.example.com:2181>,hadoopdn1.host.ee:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@testhadoop.com
    
<mailto:hadoopdn1.host.ee:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/_h...@testhadoop.com>


    But In case I try:

     CREATE EXTERNAL TABLE default.st1_test_margusja (
      original STRING,
      rsyslog_timestamp STRING,
      rsyslog_url STRING,
      rsyslog_appname STRING,
      rsyslog_pos5 STRING,
      ts STRING,
      url STRING,
      username STRING,
        ip_address STRING,
        log_level STRING,
        content STRING
      )
      COMMENT 'Dealgate logs raw data'
      PARTITIONED BY (year STRING, month STRING, day STRING)
      ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
      WITH SERDEPROPERTIES ('input.regex'='([^\\s]+ ([^\\s]+)
    ([^\\s]+) ([^\\s]+) ([^\\s]+) [^\\s]+
    
[^\\s]+[\\s]{1,2}([^\\|]+)\\|([^\\|]+)\\|([^\\+]*)\\+([^\\|]*)\\|([^\\|]+)\\|(.+))'
    )
      STORED AS TEXTFILE
      LOCATION '/datasource/test/';

    Now I am starting to see loads of GC warnings in hiveserver2 log
    and sometomes beeline ends: Error:
    org.apache.thrift.transport.TTransportException (state=08S01,code=0)

    [hdfs@hadoopnn1 ~]$ hdfs dfs -count -h /datasource/test/
              53        7.2 K              8.4 G /datasource/test

    And now I use hive:

    [margusja@hadoopnn1 ~]$ hive
    WARNING: Use "yarn jar" to launch YARN applications.
    log4j:WARN No such property [maxBackupIndex] in
    org.apache.log4j.DailyRollingFileAppender.

    Logging initialized using configuration in
    file:/etc/hive/2.3.4.0-3485/0/hive-log4j.properties
    hive>  CREATE EXTERNAL TABLE default.st1_test_margusja (
        >   original STRING,
        >   rsyslog_timestamp STRING,
        >   rsyslog_url STRING,
        >   rsyslog_appname STRING,
        >   rsyslog_pos5 STRING,
        >   ts STRING,
        >   url STRING,
        >   username STRING,
        >     ip_address STRING,
        >     log_level STRING,
        >     content STRING
        >   )
        >   COMMENT 'Dealgate logs raw data'
        >   PARTITIONED BY (year STRING, month STRING, day STRING)
        >   ROW FORMAT SERDE
    'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
        >   WITH SERDEPROPERTIES ('input.regex'='([^\\s]+ ([^\\s]+)
    ([^\\s]+) ([^\\s]+) ([^\\s]+) [^\\s]+
    
[^\\s]+[\\s]{1,2}([^\\|]+)\\|([^\\|]+)\\|([^\\+]*)\\+([^\\|]*)\\|([^\\|]+)\\|(.+))'
    )
        >   STORED AS TEXTFILE
        >   LOCATION '/datasource/test/';
    OK
    Time taken: 1.254 seconds
    hive>


    Any hint is welcome



    Margus (margusja) Roo
    http://margus.roo.ee
    skype: margusja
    +372 51 48 780 <tel:%2B372%2051%2048%20780>

    On 11/05/16 02:16, Mich Talebzadeh wrote:

    *use default;
    drop table if exists trips;
    CREATE EXTERNAL TABLE `TRIPS`(
      `bike_nr` string,
      `duration` int,
      `start_date` string,
      `start_station` string,
      `end_station` string)
    PARTITIONED BY (
      `year` int,
      `month` string)
    ROW FORMAT DELIMITED
      FIELDS TERMINATED BY ','
      LINES TERMINATED BY '\n'
    STORED AS INPUTFORMAT
      'org.apache.hadoop.mapred.TextInputFormat'
    OUTPUTFORMAT
    'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    LOCATION
      '/test/text/trips';*

Re: Create external table

Reply via email to