Re: "org.apache.thrift.transport.TTransportException: Invalid status -128" errors when SASL is enabled

2024-01-11 Thread Austin Hackett
For the benefit of anyone who comes across this error in future, it was solved 
by adding hive.metastore.sasl.enabled and hive.metastore.kerberos.principal to 
hive-site.xml on the client side, e.g. $SPARK_HOME/conf
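For reference, this is roughly what ended up in the client-side hive-site.xml (the principal value here is a placeholder; substitute your own host pattern and realm):

```xml
<property>
  <name>hive.metastore.sasl.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.metastore.kerberos.principal</name>
  <value>hive/_HOST@EXAMPLE.NET</value>
</property>
```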


> On 8 Jan 2024, at 16:18, Austin Hackett  wrote:
> 
> Hi List
>  
> I'm having an issue where Hive Metastore operations (e.g. show databases) are 
> failing with "org.apache.thrift.transport.TTransportException: Invalid status 
> -128" errors when I enable SASL.
>  
> I am a bit stuck on how to go about troubleshooting this further, and any 
> pointers would be greatly appreciated...
>  
> Full details as follows:
>  
> - Ubuntu 22.04 & OpenJDK 8u342
> - Unpacked Hive 3.1.3 binary release 
> (https://dlcdn.apache.org/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz) to 
> /opt/hive
> - Unpacked Hadoop 3.1.0 binary release 
> (https://archive.apache.org/dist/hadoop/common/hadoop-3.1.0/hadoop-3.1.0.tar.gz)
>  to /opt/hadoop
> - Created /opt/hive/conf/metastore-site.xml (see below for contents) and 
> copied hdfs-site.xml and core-site.xml from the target HDFS cluster to 
> /opt/hive/conf
> - export HADOOP_HOME=/opt/hadoop
> - export HIVE_HOME=/opt/hive
> - Successfully started the metastore, i.e. hive --service metastore
> - Use a Hive Metastore client to "show databases" and get an error (see below 
> for the associated errors in the HMS log). I get the same error with 
> spark-shell running in local mode and the Python hive-metastore-client 
> (https://pypi.org/project/hive-metastore-client/)
>  
>  
> metastore-site.xml
> ==
> <configuration>
>   <property>
>     <name>metastore.warehouse.dir</name>
>     <value>/user/hive/warehouse</value>
>   </property>
>   <property>
>     <name>javax.jdo.option.ConnectionDriverName</name>
>     <value>org.postgresql.Driver</value>
>   </property>
>   <property>
>     <name>javax.jdo.option.ConnectionURL</name>
>     <value>jdbc:postgresql://postgres.example.net:5432/metastore_db</value>
>   </property>
>   <property>
>     <name>javax.jdo.option.ConnectionUserName</name>
>     <value>hive</value>
>   </property>
>   <property>
>     <name>javax.jdo.option.ConnectionPassword</name>
>     <value>password</value>
>   </property>
>   <property>
>     <name>metastore.kerberos.principal</name>
>     <value>hive/_h...@example.net</value>
>   </property>
>   <property>
>     <name>metastore.kerberos.keytab.file</name>
>     <value>/etc/security/keytabs/hive.keytab</value>
>   </property>
>   <property>
>     <name>hive.metastore.sasl.enabled</name>
>     <value>true</value>
>   </property>
> </configuration>
> ==
>  
> HMS log shows that it is able to authenticate using the specified keytab and 
> principal (and I have also checked this manually via the kinit command):
>  
> 
> 2024-01-08T13:12:33,463  WARN [main] security.HadoopThriftAuthBridge: Client-facing principal not set. Using server-side setting: hive/_h...@example.net
> 2024-01-08T13:12:33,464  INFO [main] security.HadoopThriftAuthBridge: Logging in via CLIENT based principal
> 2024-01-08T13:12:33,471 DEBUG [main] security.UserGroupInformation: Hadoop login
> 2024-01-08T13:12:33,472 DEBUG [main] security.UserGroupInformation: hadoop login commit
> 2024-01-08T13:12:33,472 DEBUG [main] security.UserGroupInformation: Using kerberos user: hive/metstore.example@example.net
> 2024-01-08T13:12:33,472 DEBUG [main] security.UserGroupInformation: Using user: "hive/metstore.example@example.net" with name: hive/metstore.example@example.net
> 2024-01-08T13:12:33,472 DEBUG [main] security.UserGroupInformation: User entry: "hive/metstore.example@example.net"
> 2024-01-08T13:12:33,472  INFO [main] security.UserGroupInformation: Login successful for user hive/metstore.example@example.net using keytab file hive.keytab. Keytab auto renewal enabled : false
> 2024-01-08T13:12:33,472  INFO [main] security.HadoopThriftAuthBridge: Logging in via SERVER based principal
> 2024-01-08T13:12:33,480 DEBUG [main] security.UserGroupInformation: Hadoop login
> 2024-01-08T13:12:33,480 DEBUG [main] security.UserGroupInformation: hadoop login commit
> 2024-01-08T13:12:33,480 DEBUG [main] security.UserGroupInformation: Using kerberos user: hive/metstore.example@example.net
> 2024-01-08T13:12:33,480 DEBUG [main] security.UserGroupInformation: Using user: "hive/metstore.example@example.net" with name: hive/metstore.example@example.net

"org.apache.thrift.transport.TTransportException: Invalid status -128" errors when SASL is enabled

2024-01-08 Thread Austin Hackett
Hi List
 
I'm having an issue where Hive Metastore operations (e.g. show databases) are 
failing with "org.apache.thrift.transport.TTransportException: Invalid status 
-128" errors when I enable SASL.
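 
For what it's worth, my understanding (an assumption on my part, not something from the Hive docs) is that the -128 is the first byte of a plain Thrift binary-protocol frame, 0x80, being read as a signed SASL status byte, i.e. one side of the connection is speaking SASL and the other is not. A quick sketch of where the number comes from:

```python
import struct

# The first byte of a Thrift TBinaryProtocol "strict" message is 0x80. A SASL
# transport expects a one-byte status code here and reads it as a signed byte,
# so 0x80 surfaces as -128 in the exception message.
status, = struct.unpack("b", b"\x80")
print(status)  # -128
```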
 
I am a bit stuck on how to go about troubleshooting this further, and any 
pointers would be greatly appreciated...
 
Full details as follows:
 
- Ubuntu 22.04 & OpenJDK 8u342
- Unpacked Hive 3.1.3 binary release 
(https://dlcdn.apache.org/hive/hive-3.1.3/apache-hive-3.1.3-bin.tar.gz) to 
/opt/hive
- Unpacked Hadoop 3.1.0 binary release 
(https://archive.apache.org/dist/hadoop/common/hadoop-3.1.0/hadoop-3.1.0.tar.gz)
 to /opt/hadoop
- Created /opt/hive/conf/metastore-site.xml (see below for contents) and copied 
hdfs-site.xml and core-site.xml from the target HDFS cluster to /opt/hive/conf
- export HADOOP_HOME=/opt/hadoop
- export HIVE_HOME=/opt/hive
- Successfully started the metastore, i.e. hive --service metastore
- Use a Hive Metastore client to "show databases" and get an error (see below 
for the associated errors in the HMS log). I get the same error with 
spark-shell running in local mode and the Python hive-metastore-client 
(https://pypi.org/project/hive-metastore-client/)
 
 
metastore-site.xml
==
<configuration>
  <property>
    <name>metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>org.postgresql.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:postgresql://postgres.example.net:5432/metastore_db</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>password</value>
  </property>
  <property>
    <name>metastore.kerberos.principal</name>
    <value>hive/_h...@example.net</value>
  </property>
  <property>
    <name>metastore.kerberos.keytab.file</name>
    <value>/etc/security/keytabs/hive.keytab</value>
  </property>
  <property>
    <name>hive.metastore.sasl.enabled</name>
    <value>true</value>
  </property>
</configuration>
==
 
HMS log shows that it is able to authenticate using the specified keytab and 
principal (and I have also checked this manually via the kinit command):
 

2024-01-08T13:12:33,463  WARN [main] security.HadoopThriftAuthBridge: Client-facing principal not set. Using server-side setting: hive/_h...@example.net
2024-01-08T13:12:33,464  INFO [main] security.HadoopThriftAuthBridge: Logging in via CLIENT based principal
2024-01-08T13:12:33,471 DEBUG [main] security.UserGroupInformation: Hadoop login
2024-01-08T13:12:33,472 DEBUG [main] security.UserGroupInformation: hadoop login commit
2024-01-08T13:12:33,472 DEBUG [main] security.UserGroupInformation: Using kerberos user: hive/metstore.example@example.net
2024-01-08T13:12:33,472 DEBUG [main] security.UserGroupInformation: Using user: "hive/metstore.example@example.net" with name: hive/metstore.example@example.net
2024-01-08T13:12:33,472 DEBUG [main] security.UserGroupInformation: User entry: "hive/metstore.example@example.net"
2024-01-08T13:12:33,472  INFO [main] security.UserGroupInformation: Login successful for user hive/metstore.example@example.net using keytab file hive.keytab. Keytab auto renewal enabled : false
2024-01-08T13:12:33,472  INFO [main] security.HadoopThriftAuthBridge: Logging in via SERVER based principal
2024-01-08T13:12:33,480 DEBUG [main] security.UserGroupInformation: Hadoop login
2024-01-08T13:12:33,480 DEBUG [main] security.UserGroupInformation: hadoop login commit
2024-01-08T13:12:33,480 DEBUG [main] security.UserGroupInformation: Using kerberos user: hive/metstore.example@example.net
2024-01-08T13:12:33,480 DEBUG [main] security.UserGroupInformation: Using user: "hive/metstore.example@example.net" with name: hive/metstore.example@example.net
2024-01-08T13:12:33,480 DEBUG [main] security.UserGroupInformation: User entry: "hive/metstore.example@example.net"
2024-01-08T13:12:33,480  INFO [main] security.UserGroupInformation: Login successful for user hive/metstore.example@example.net using keytab file hive.keytab. Keytab auto renewal enabled : false

 
However, when I attempt to "show databases":
 

2024-01-08T13:59:08,068 DEBUG [pool-6-thread-1] security.UserGroupInformation: PrivilegedAction [as: hive/metstore.example@example.net (auth:KERBEROS)][action:org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1@1e655c9]
java.lang.Exception: 

Re: Hive 3.1.3 Hadoop Compatibility

2023-12-22 Thread Austin Hackett
Many thanks for clarifying Ayush - much appreciated 

> On 22 Dec 2023, at 08:41, Ayush Saxena  wrote:
> 
> Ideally Hadoop should be on 3.1.0 only; that is what we support. Beyond
> that, if there are no incompatibilities it might or might not work with
> higher versions of Hadoop, but we at "hive" don't claim that it can work.
> It will mostly create issues with the hadoop-3.3.x line due to third-party
> libs and stuff like that; Guava IIRC does create some mess.
> 
> So, short answer: we officially support only the above-said Hadoop
> versions for a particular Hive release.
> 
> -Ayush
> 
>> On Fri, 22 Dec 2023 at 03:03, Austin Hackett  wrote:
>> 
>> Hi Ayush
>> 
>> Many thanks for your response.
>> 
>> I’d really appreciate a clarification if that’s OK?
>> 
>> Does this just mean that the Hadoop 3.1.0 libraries need to be deployed with 
>> Hive, or does it also mean the Hadoop cluster itself cannot be on a version 
>> later than 3.1.0 (if using Hive 3.1.3).
>> 
>> For example, if running the Hive 3.1.3 Metastore in standalone mode, can the 
>> HMS work with a 3.3.6 HDFS cluster providing the Hadoop 3.1.0 libraries are 
>> deployed alongside the HMS?
>> 
>> Any help is much appreciated
>> 
>> Thank you
>> 
>> 
>> 
>>>> On 21 Dec 2023, at 12:18, Ayush Saxena  wrote:
>>> 
>>> Hi Austin,
>>> Hive 3.1.3 & 4.0.0-alpha-1 works with Hadoop-3.1.0
>>> 
>>> Hive 4.0.0-alpha-2 & 4.0.0-beta-1 works with Hadoop-3.3.1
>>> 
>>> The upcoming Hive 4.0 GA release would be compatible with Hadoop-3.3.6
>>> 
>>> -Ayush
>>> 
>>> On Thu, 21 Dec 2023 at 17:39, Austin Hackett  wrote:
>>>> 
>>>> Hi List
>>>> 
>>>> I was hoping that someone might be able to clarify which Hadoop versions 
>>>> Hive 3.1.3 is compatible with?
>>>> 
>>>> https://hive.apache.org/general/downloads/ says that Hive release 3.1.3 
>>>> works with Hadoop 3.x.y which is straightforward enough.
>>>> 
>>>> However, I notice the 4.0.0 releases only work with Hadoop 3.3.1, which 
>>>> makes me wonder if 3.1.3 doesn't actually work with 3.3.1.
>>>> 
>>>> Similarly, I see that HIVE-27757 upgrades Hadoop to 3.3.6 in Hive 4.0.0, 
>>>> which makes me wonder if Hive 4.0.0 actually works with 3.3.6 and not 
>>>> 3.3.1 as mentioned on the releases page.
>>>> 
>>>> In summary: does Hive 3.1.3 work with Hadoop 3.3.6, and if not, which 
>>>> Hadoop 3.x.x versions are known to work?
>>>> 
>>>> Any pointers would be greatly appreciated
>>>> 
>>>> Thank you
>> 


Re: Hive 3.1.3 Hadoop Compatibility

2023-12-21 Thread Austin Hackett
Hi Ayush

Many thanks for your response.

I’d really appreciate a clarification if that’s OK?

Does this just mean that the Hadoop 3.1.0 libraries need to be deployed with 
Hive, or does it also mean the Hadoop cluster itself cannot be on a version 
later than 3.1.0 (if using Hive 3.1.3).

For example, if running the Hive 3.1.3 Metastore in standalone mode, can the 
HMS work with a 3.3.6 HDFS cluster providing the Hadoop 3.1.0 libraries are 
deployed alongside the HMS?

Any help is much appreciated

Thank you



> On 21 Dec 2023, at 12:18, Ayush Saxena  wrote:
> 
> Hi Austin,
> Hive 3.1.3 & 4.0.0-alpha-1 works with Hadoop-3.1.0
> 
> Hive 4.0.0-alpha-2 & 4.0.0-beta-1 works with Hadoop-3.3.1
> 
> The upcoming Hive 4.0 GA release would be compatible with Hadoop-3.3.6
> 
> -Ayush
> 
> On Thu, 21 Dec 2023 at 17:39, Austin Hackett  wrote:
>> 
>> Hi List
>> 
>> I was hoping that someone might be able to clarify which Hadoop versions 
>> Hive 3.1.3 is compatible with?
>> 
>> https://hive.apache.org/general/downloads/ says that Hive release 3.1.3 
>> works with Hadoop 3.x.y which is straightforward enough.
>> 
>> However, I notice the 4.0.0 releases only work with Hadoop 3.3.1, which 
>> makes me wonder if 3.1.3 doesn't actually work with 3.3.1.
>> 
>> Similarly, I see that HIVE-27757 upgrades Hadoop to 3.3.6 in Hive 4.0.0, 
>> which makes me wonder if Hive 4.0.0 actually works with 3.3.6 and not 3.3.1 
>> as mentioned on the releases page.
>> 
>> In summary: does Hive 3.1.3 work with Hadoop 3.3.6, and if not, which Hadoop 
>> 3.x.x versions are known to work?
>> 
>> Any pointers would be greatly appreciated
>> 
>> Thank you



Hive 3.1.3 Hadoop Compatibility

2023-12-21 Thread Austin Hackett
Hi List

I was hoping that someone might be able to clarify which Hadoop versions Hive 
3.1.3 is compatible with?

https://hive.apache.org/general/downloads/ says that Hive release 3.1.3 works 
with Hadoop 3.x.y which is straightforward enough.

However, I notice the 4.0.0 releases only work with Hadoop 3.3.1, which makes 
me wonder if 3.1.3 doesn't actually work with 3.3.1.

Similarly, I see that HIVE-27757 upgrades Hadoop to 3.3.6 in Hive 4.0.0, which 
makes me wonder if Hive 4.0.0 actually works with 3.3.6 and not 3.3.1 as 
mentioned on the releases page.

In summary: does Hive 3.1.3 work with Hadoop 3.3.6, and if not, which Hadoop 
3.x.x versions are known to work?

Any pointers would be greatly appreciated 

Thank you

Re: Is Insert Overwrite table partition on s3 is an atomic operation ?

2021-01-11 Thread Austin Hackett
Hi Mark

It’s my understanding that when you do an INSERT OVERWRITE into a partition, 
Hive will take out an exclusive lock on the partition and a shared lock on the 
table itself. This blocks all read and write operations on the partition, and 
allows reads against the other partitions to proceed.

I am assuming that you have hive.txn.strict.locking.mode set to its default 
value of true, and the table is non-ACID. With 
hive.txn.strict.locking.mode=false and a non-ACID table, then the lock is 
shared, i.e. concurrent writes to the same table are allowed.
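
A sketch of the behaviour being described (the table and partition names are made up for illustration):

```sql
-- With hive.txn.strict.locking.mode=true (the default) and a non-ACID table,
-- this INSERT OVERWRITE takes an EXCLUSIVE lock on the target partition and a
-- SHARED lock on the table, so reads of other partitions can proceed:
INSERT OVERWRITE TABLE sales PARTITION (dt = '2021-01-11')
SELECT id, amount FROM staging_sales WHERE dt = '2021-01-11';

-- The locks can be inspected from another session while it runs:
SHOW LOCKS sales PARTITION (dt = '2021-01-11');
```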

Hopefully that is the information you were looking for. Apologies if not.

Thanks

Austin

> On 11 Jan 2021, at 16:44, Mark Norkin  wrote:
> 
> Hello Hive users,
> 
> We are using AWS Glue as Hive compatible metastore when running queries on 
> EMR. For Hive external tables we are using AWS S3.
> 
> After looking at the docs we didn't find a conclusive answer on whether an 
> Insert Overwrite table partition is an atomic operation, maybe we've missed 
> it and it is documented somewhere, or maybe someone knows from their 
> experience?
> 
> If it's an atomic operation, is there any difference whether the table is 
> external or a managed one?
> 
> Thank you,
> 
> Mark



Re: How useful are tools for Hive data modeling

2020-11-11 Thread Austin Hackett
Hi Mich

Understood, I was thinking along the lines of the tool being able to 
auto-generate SQL join syntax etc, rather than in terms of scan performance.

I’m not so familiar with Parquet with Hive. I know that Parquet also has min 
and max indexes, and more recently bloom filters. However, I recall reading 
that Hive can’t take advantage of them. That might have changed since though? 
In order to make the most of these, you usually need to sort your data at 
insert time, which may or may not be feasible.
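
The sort-at-insert-time idea looks something like this (hypothetical table and column names):

```sql
-- Sorting within each partition at write time keeps values clustered, so the
-- per-row-group min/max statistics (and bloom filters, where supported)
-- become selective for later predicates on event_ts:
INSERT OVERWRITE TABLE events PARTITION (dt = '2020-11-11')
SELECT id, event_ts, payload
FROM staging_events
WHERE dt = '2020-11-11'
SORT BY event_ts;
```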

If a nicely selective partitioning key, plus a columnar file format (which of 
course Parquet is), doesn’t give you the performance you need, I guess a hand 
rolled "materialised view" is where I’d look next (Hive 3.x does have native MV 
support, but I think only with ORC).
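
For reference, the native MV syntax in Hive 3.x is along these lines (hypothetical example; as noted above, the source tables need to be transactional, which in practice means ORC):

```sql
CREATE MATERIALIZED VIEW daily_totals AS
SELECT dt, SUM(amount) AS total
FROM sales
GROUP BY dt;
```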

Thanks

Austin



> On 11 Nov 2020, at 19:59, Mich Talebzadeh  wrote:
> 
> Many thanks Austin.
> 
> The challenge I have been told is how to effectively query a subset of data 
> avoiding full table scan. The tables I believe are parquet.
> 
> I know performance in Hive is not that great, so anything that could help 
> would be great.
> 
> Cheers,
> 
>  
> 
> LinkedIn  
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>  
>  
> 
> 
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
> damage or destruction of data or any other property which may arise from 
> relying on this email's technical content is explicitly disclaimed. The 
> author will in no case be liable for any monetary damages arising from such 
> loss, damage or destruction.
>  
> 
> 
> On Wed, 11 Nov 2020 at 19:32, Austin Hackett  wrote:
> Hi Mich
> 
> Hive also has non-validated primary key, foreign key etc constraints. Whilst 
> I’m not too familiar with the modelling tools you mention, perhaps they’re 
> able to use these for generating SQL etc?
> 
> ORC files have indexes (min, max, bloom filters) - not particularly relevant 
> to the data modelling tools question, but mentioning it for completeness…
> 
> Thanks
> 
> Austin
> 
> 
>> On 11 Nov 2020, at 17:14, Mich Talebzadeh  wrote:
>> 
>> Many thanks Peter. 
>> 
>> 
>>  
>> 
>> LinkedIn  
>> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>  
>>  
>> 
>> 
>> Disclaimer: Use it at your own risk. Any and all responsibility for any 
>> loss, damage or destruction of data or any other property which may arise 
>> from relying on this email's technical content is explicitly disclaimed. The 
>> author will in no case be liable for any monetary damages arising from such 
>> loss, damage or destruction.
>>  
>> 
>> 
>> On Wed, 11 Nov 2020 at 16:58, Peter Vary  wrote:
>> Hi Mich,
>> 
>> Index support was removed from hive:
>> https://issues.apache.org/jira/browse/HIVE-21968
>> https://issues.apache.org/jira/browse/HIVE-18715
>> 
>> Thanks,
>> Peter
>> 
>>> On Nov 11, 2020, at 17:25, Mich Talebzadeh  wrote:
>>> 
>>> Hi all,
>>> 
>>> I wrote these notes earlier this year. 
>>> 
>>> I heard today that someone mentioned Hive 1 does not support indexes but 
>>> hive 2 does.
>>> 
>>> I still believe that Hive does not support indexing as per below. Has this 
>>> been changed?
>>> 
>>> Regards,
>>> 
>>> Mich
>>> 
>>> -- Forwarded message -
>>> From: Mich Talebzadeh 
>>> Date: Thu, 2 Apr 2020 at 12:17
>>> Subject: How useful are tools for Hive data modeling
>>> To: user <user@hive.apache.org>
>>> 
>>> 
>>> Hi,
>>> 
>>> Fundamentally Hive tables have structure and support provided by desc 
>>> formatted  and show partitions .
>>> 
>>> Hive does not support indexes in real HQL operations (I stand corrected). 
>>> So what we have are tables, partitions and clustering (AKA hash 
>>> partitioning). 
>>> 
>>> Hive does not support indexes because Hadoop lacks the block locality 
>>> necessary for indexes. So if I use a tool like Collibra, Ab Initio etc., 
>>> what advantage(s) is one going to gain on top of a simple shell script to 
>>> get table and partition definitions?
>>> 
>>> Thanks,
>>> 
>>> 
>>> Disclaimer: Use it at your own risk. Any and all responsibility for any 
>>> loss, damage or destruction of data or any other property which may arise 
>>> from relying on this email's technical content is explicitly disclaimed. 
>>> The author will in no case be liable for any monetary damages arising from 
>>> such loss, damage or destruction.
>>>  
>> 
> 



Re: How useful are tools for Hive data modeling

2020-11-11 Thread Austin Hackett
Hi Mich

Hive also has non-validated primary key, foreign key etc constraints. Whilst 
I’m not too familiar with the modelling tools you mention, perhaps they’re able 
to use these for generating SQL etc?
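
For example, a non-validated constraint is declared with the DISABLE NOVALIDATE (and optionally RELY) modifiers; Hive records it as metadata without enforcing it, which is exactly the kind of hint a modelling tool could pick up (hypothetical tables):

```sql
ALTER TABLE orders
  ADD CONSTRAINT orders_pk PRIMARY KEY (order_id) DISABLE NOVALIDATE RELY;

ALTER TABLE order_items
  ADD CONSTRAINT order_items_fk FOREIGN KEY (order_id)
  REFERENCES orders (order_id) DISABLE NOVALIDATE RELY;
```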

ORC files have indexes (min, max, bloom filters) - not particularly relevant to 
the data modelling tools question, but mentioning it for completeness…

Thanks

Austin


> On 11 Nov 2020, at 17:14, Mich Talebzadeh  wrote:
> 
> Many thanks Peter. 
> 
> 
>  
> 
> LinkedIn  
> https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>  
> 
>  
> 
> 
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, 
> damage or destruction of data or any other property which may arise from 
> relying on this email's technical content is explicitly disclaimed. The 
> author will in no case be liable for any monetary damages arising from such 
> loss, damage or destruction.
>  
> 
> 
> On Wed, 11 Nov 2020 at 16:58, Peter Vary  wrote:
> Hi Mich,
> 
> Index support was removed from hive:
> https://issues.apache.org/jira/browse/HIVE-21968 
> 
> https://issues.apache.org/jira/browse/HIVE-18715 
> 
> 
> Thanks,
> Peter
> 
>> On Nov 11, 2020, at 17:25, Mich Talebzadeh  wrote:
>> 
>> Hi all,
>> 
>> I wrote these notes earlier this year. 
>> 
>> I heard today that someone mentioned Hive 1 does not support indexes but 
>> hive 2 does.
>> 
>> I still believe that Hive does not support indexing as per below. Has this 
>> been changed?
>> 
>> Regards,
>> 
>> Mich
>> 
>> -- Forwarded message -
>> From: Mich Talebzadeh 
>> Date: Thu, 2 Apr 2020 at 12:17
>> Subject: How useful are tools for Hive data modeling
>> To: user <user@hive.apache.org>
>> 
>> 
>> Hi,
>> 
>> Fundamentally Hive tables have structure and support provided by desc 
>> formatted  and show partitions .
>> 
>> Hive does not support indexes in real HQL operations (I stand corrected). So 
>> what we have are tables, partitions and clustering (AKA hash partitioning). 
>> 
>> Hive does not support indexes because Hadoop lacks the block locality 
>> necessary for indexes. So if I use a tool like Collibra, Ab Initio etc., 
>> what advantage(s) is one going to gain on top of a simple shell script to 
>> get table and partition definitions?
>> 
>> Thanks,
>> 
>> 
>> Disclaimer: Use it at your own risk. Any and all responsibility for any 
>> loss, damage or destruction of data or any other property which may arise 
>> from relying on this email's technical content is explicitly disclaimed. The 
>> author will in no case be liable for any monetary damages arising from such 
>> loss, damage or destruction.
>>  
>