Re: ORC file across multiple HDFS blocks

2015-04-28 Thread Grant Overby (groverby)
Expanding on Alan’s post:

Files are intended to span many blocks, and a single file may be read by many 
mappers. For a file to be read by many mappers, it goes through a process called 
input splitting, which splits the input around HDFS block boundaries.

If a unit of data within a file crosses an HDFS block boundary, a portion of that 
unit of data must be sent from the node that holds the block/mapper of one portion 
to the node that holds the block/mapper of the other portion. Take a CSV file, for 
example: in this case a unit of data is a line, and transferring a portion of a 
line between boxes is no big deal.

This changes a bit for ORC files, as the unit of data is a stripe. An ORC stripe 
is typically a few hundred MB. Without some additional logic, a substantial part 
of data locality would be lost; however, ORC has such additional logic. The stripe 
size of the ORC file should be set a few MB below the HDFS block size, with padding 
enabled, to produce a 1:1 relationship between an ORC stripe and an HDFS block. How 
many stripes or blocks are "in" a single file is of no consequence so long as this 
1:1 relationship is maintained.

Below is an example config for 128 MB HDFS blocks.

Configuration writerConf = new Configuration();
// other config

OrcFile.WriterOptions writerOptions = OrcFile.writerOptions(writerConf);
writerOptions.blockPadding(true);             // pad so a stripe never straddles a block boundary
writerOptions.stripeSize(122 * 1024 * 1024);  // a few MB below the 128 MB block size
// other options

Writer writer = OrcFile.createWriter(path, writerOptions);




Grant Overby
Software Engineer
Cisco.com
grove...@cisco.com
Mobile: 865 724 4910

From: Alan Gates <alanfga...@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Monday, April 27, 2015 at 2:05 PM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: ORC file across multiple HDFS blocks

to cross blocks and hence n


default number of reducers

2015-04-28 Thread Shushant Arora
In a normal MR job, can I configure (cluster-wide) a default number of reducers
to be used if I don't specify any reducers in my job?


Re: ORC file across multiple HDFS blocks

2015-04-28 Thread Demai Ni
Alan and Grant,

many thanks. Grant's comment is exactly the point I am exploring.

A bit of background here. I am working on an MPP way to read ORC files through
this C++ API (https://github.com/hortonworks/orc) by Owen and team. The MPP
mechanism uses one (or several) independent processes per HDFS node, which work
like client code to read the ORC file(s). Currently, the work for each process is
scheduled at the ORC file level, which runs into the "loss of data locality" issue
described by Grant. I didn't realize that we could do the scheduling at stripe
level. Good to know; that surely makes sense.

Demai
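
As a rough illustration of stripe-level scheduling, here is a minimal Java sketch
using the Hive ORC reader API (OrcFile.createReader with Reader.getStripes()); it
only lists each stripe's byte range so a scheduler can hand each stripe to a process
on the node that holds the corresponding HDFS block. This is an analogue of what the
C++ reader would do, not part of it; the StripeLister class and stripeRanges helper
are invented for the example, and the exact method names should be checked against
your Hive version.

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.Reader;
import org.apache.hadoop.hive.ql.io.orc.StripeInformation;

/** Hypothetical helper: list the (offset, length) byte range of every stripe in an ORC file. */
public class StripeLister {
    public static List<long[]> stripeRanges(Configuration conf, Path orcFile)
            throws IOException {
        // Opens only the file footer/metadata; no row data is read here.
        Reader reader = OrcFile.createReader(orcFile, OrcFile.readerOptions(conf));
        List<long[]> ranges = new ArrayList<long[]>();
        for (StripeInformation stripe : reader.getStripes()) {
            // One entry per stripe: start byte and total stripe length in bytes.
            ranges.add(new long[] { stripe.getOffset(), stripe.getLength() });
        }
        return ranges;
    }
}

With the 1:1 stripe-to-block layout Grant describes, each of these ranges should line
up with one HDFS block, so assigning a stripe to the process on that block's node
keeps the read local.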



Re: ORC file across multiple HDFS blocks

2015-04-28 Thread Owen O'Malley
You can also use the C++ reader to read a set of stripes. Look at the
ReaderOptions.range(offset, length), which selects the range of stripes to
process in terms of bytes.

.. Owen
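
To make this concrete on the Java side as well (the C++ path being ReaderOptions.range
as Owen describes), here is a minimal sketch that reads only the stripes falling in a
given byte range. It assumes the Hive Reader.rows(offset, length, include) overload,
where, to my understanding, any stripe whose start offset lies inside the range is read
and a null include array means all columns; the RangeRowCount class is invented for the
example, so verify the signature against your Hive version.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.Reader;
import org.apache.hadoop.hive.ql.io.orc.RecordReader;

/** Hypothetical sketch: count rows in the stripes that start inside [offset, offset + length). */
public class RangeRowCount {
    public static long countRows(Configuration conf, Path orcFile,
                                 long offset, long length) throws IOException {
        Reader reader = OrcFile.createReader(orcFile, OrcFile.readerOptions(conf));
        // Assumed behaviour: only stripes whose first byte falls in the range are
        // scanned; passing null for the include array reads all columns.
        RecordReader rows = reader.rows(offset, length, null);
        long count = 0;
        Object row = null;
        while (rows.hasNext()) {
            row = rows.next(row);   // reuse the previous row object where possible
            count++;
        }
        rows.close();
        return count;
    }
}

The same (offset, length) pair is what a C++ worker would pass to ReaderOptions.range,
so the scheduling logic can stay the same on both sides.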



Re: ORC file across multiple HDFS blocks

2015-04-28 Thread Demai Ni
Owen,

cool.  That is great. Thanks

Demai



hive metastore's schematool -upgradeSchema on postgres throws an error on CREATE TABLE PART_COL_STATS

2015-04-28 Thread jun aoki
Hi hive community,

I am new to Hive, and this may be a stupid question, but let me know if you
know the answer.

I am attempting to upgrade the Hive metastore schema from 0.12 to 0.14. The whole
log is here [2].

At the end, the VERSION table shows SCHEMA_VERSION 0.14.0 [1], where it was 0.12.0
before, so the upgrade seems successful. However, if you take a closer look at the
log, you find an error: "Error: ERROR: relation "PART_COL_STATS" already exists"

The error seems to have occurred in pre-0-upgrade-0.13.0-to-0.14.0.postgres.sql,
which is a single CREATE TABLE command (e.g.
https://github.com/apache/hive/blob/branch-0.14/metastore/scripts/upgrade/postgres/pre-0-upgrade-0.13.0-to-0.14.0.postgres.sql),
and it is maybe OK for it to fail that way, since my current Postgres already has
that table.

I have never tried it, but MySQL's upgrade scripts have "IF NOT EXISTS" on CREATE
TABLE, so I think the error would not show up there (e.g.
https://github.com/apache/hive/blob/master/metastore/scripts/upgrade/mysql/019-HIVE-7784.mysql.sql
)


My questions are:
(a) Is this considered a successful upgrade, since pre-0-upgrade...sql is a single
CREATE TABLE?
(b) Is this a legitimate bug specific to Postgres in the Hive product (specifically,
the Hive metastore schematool and the missing "IF NOT EXISTS")?




[1]
[root@rhel65-4 database_backup]# ; psql -p10432 -d metastore hive -c'\d
"DBS"' ; psql -p10432 -d metastore hive -c'SELECT * FROM "VERSION"' ;psql
-p10432 -d metastore hive -c'\d "PART_COL_STATS"'
psql -p10432 -d metastore hive -c'SELECT * FROM "VERSION"'
 VER_ID | SCHEMA_VERSION |      VERSION_COMMENT
--------+----------------+------------------------------
      1 | 0.14.0         | Hive release version 0.14.0
(1 row)



[2] The whole log from -upgradeSchema
[root@rhel65-4 hive]# su -s /bin/bash - hdfs -c
'HIVE_CONF_DIR=/etc/hive/conf.server
/usr/phd/current/hive-metastore/bin/schematool -upgradeSchema -dbType
postgres -verbose'
15/04/28 16:28:09 WARN conf.HiveConf: HiveConf of name
hive.optimize.mapjoin.mapreduce does not exist
15/04/28 16:28:09 WARN conf.HiveConf: HiveConf of name hive.heapsize does
not exist
15/04/28 16:28:09 WARN conf.HiveConf: HiveConf of name
hive.server2.enable.impersonation does not exist
15/04/28 16:28:09 WARN conf.HiveConf: HiveConf of name
hive.auto.convert.sortmerge.join.noconditionaltask does not exist
Metastore connection URL:
 jdbc:postgresql://rhel65-4.localdomain:10432/metastore
Metastore Connection Driver :org.postgresql.Driver
Metastore connection User:   hive
Starting upgrade metastore schema from version 0.12.0 to 0.14.0
Upgrade script upgrade-0.12.0-to-0.13.0.postgres.sql
Looking for pre-0-upgrade-0.12.0-to-0.13.0.postgres.sql in
/usr/phd/3.0.0.0-247/hive/scripts/metastore/upgrade/postgres
Connecting to jdbc:postgresql://rhel65-4.localdomain:10432/metastore
Connected to: PostgreSQL (version 8.4.18)
Driver: PostgreSQL Native Driver (version PostgreSQL 8.4 JDBC4 (build 701))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:postgresql://rhel65-4.localdomain:104> !autocommit on
Autocommit status: true
0: jdbc:postgresql://rhel65-4.localdomain:104> CREATE LANGUAGE plpgsql
Error: ERROR: language "plpgsql" already exists (state=42710,code=0)

Closing: 0: jdbc:postgresql://rhel65-4.localdomain:10432/metastore
Warning in pre-upgrade script pre-0-upgrade-0.12.0-to-0.13.0.postgres.sql:
Schema script failed, errorcode 2
java.io.IOException: Schema script failed, errorcode 2
at
org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:380)
at
org.apache.hive.beeline.HiveSchemaTool.runBeeLine(HiveSchemaTool.java:353)
at
org.apache.hive.beeline.HiveSchemaTool.runPreUpgrade(HiveSchemaTool.java:323)
at
org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:243)
at
org.apache.hive.beeline.HiveSchemaTool.doUpgrade(HiveSchemaTool.java:217)
at
org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:493)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Looking for pre-1-upgrade-0.12.0-to-0.13.0.postgres.sql in
/usr/phd/3.0.0.0-247/hive/scripts/metastore/upgrade/postgres
Connecting to jdbc:postgresql://rhel65-4.localdomain:10432/metastore
Connected to: PostgreSQL (version 8.4.18)
Driver: PostgreSQL Native Driver (version PostgreSQL 8.4 JDBC4 (build 701))
Transaction isolation: TRANSACTION_READ_COMMITTED
0: jdbc:postgresql://rhel65-4.localdomain:104> !autocommit on
Autocommit status: true
0: jdbc:postgresql://rhel65-4.localdomain:104> SELECT 'Upgrading MetaStore
schema from 0.12.0 to 0.13.0'
+---+--+
| ?column?   

RE: hive metastore's schematool -upgradeSchema on postgres throws an error on CREATE TABLE PART_COL_STATS

2015-04-28 Thread Mich Talebzadeh
Hi,

My version is 0.14 on an Oracle metastore, and there is no DROP command there.
The table seems to hold partition column statistics, so it is just a stats table.


CREATE TABLE PART_COL_STATS (
    CS_ID NUMBER NOT NULL,
    DB_NAME VARCHAR2(128) NOT NULL,
    TABLE_NAME VARCHAR2(128) NOT NULL,
    PARTITION_NAME VARCHAR2(767) NOT NULL,
    COLUMN_NAME VARCHAR2(128) NOT NULL,
    COLUMN_TYPE VARCHAR2(128) NOT NULL,
    PART_ID NUMBER NOT NULL,
    LONG_LOW_VALUE NUMBER,
    LONG_HIGH_VALUE NUMBER,
    DOUBLE_LOW_VALUE NUMBER,
    DOUBLE_HIGH_VALUE NUMBER,
    BIG_DECIMAL_LOW_VALUE VARCHAR2(4000),
    BIG_DECIMAL_HIGH_VALUE VARCHAR2(4000),
    NUM_NULLS NUMBER NOT NULL,
    NUM_DISTINCTS NUMBER,
    AVG_COL_LEN NUMBER,
    MAX_COL_LEN NUMBER,
    NUM_TRUES NUMBER,
    NUM_FALSES NUMBER,
    LAST_ANALYZED NUMBER NOT NULL
);

ALTER TABLE PART_COL_STATS ADD CONSTRAINT PART_COL_STATS_PKEY PRIMARY KEY (CS_ID);

ALTER TABLE PART_COL_STATS ADD CONSTRAINT PART_COL_STATS_FK FOREIGN KEY (PART_ID)
    REFERENCES PARTITIONS (PART_ID) INITIALLY DEFERRED;

CREATE INDEX PART_COL_STATS_N49 ON PART_COL_STATS (PART_ID);

CREATE INDEX PCS_STATS_IDX ON PART_COL_STATS
    (DB_NAME, TABLE_NAME, COLUMN_NAME, PARTITION_NAME);

 

Can you compare the table definitions between your previous version and 0.14 and see
whether anything has changed in the DDL? Do you have any records in that table? Mine
is empty.

 

HTH

 

Mich Talebzadeh

 

http://talebzadehmich.wordpress.com

 


 
