[jira] [Commented] (HIVE-21666) Bad behavior with 'insert_only' property on non-transactional table

2019-06-12 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16862502#comment-16862502
 ] 

Todd Lipcon commented on HIVE-21666:


That seems to have fixed it for CTAS, but does it cover the other ways of 
ending up with a table like this? E.g., if someone uses an external HMS client 
(eg Spark or Impala) to create such a malformed table, or uses an ALTER TABLE 
command to set table properties. We should also perform the same validation at 
read time, rather than giving this strange error.
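
A minimal sketch of the kind of read-time validation being suggested, assuming 
the check sits somewhere that sees the HMS Table object before the read path 
runs. The helper name and message text are illustrative, not the actual 
HIVE-21666 fix:

{code}
import java.util.Map;
import org.apache.hadoop.hive.metastore.api.MetaException;
import org.apache.hadoop.hive.metastore.api.Table;

class TransactionalPropsValidator {
  // Reject tables that declare transactional_properties without actually
  // being transactional, however they were created (CTAS, ALTER TABLE,
  // or an external HMS client such as Spark or Impala).
  static void validate(Table t) throws MetaException {
    Map<String, String> params = t.getParameters();
    if (params == null) {
      return;
    }
    boolean transactional = "true".equalsIgnoreCase(params.get("transactional"));
    if (params.containsKey("transactional_properties") && !transactional) {
      throw new MetaException("Table " + t.getTableName()
          + " sets 'transactional_properties' but 'transactional' is not 'true'");
    }
  }
}
{code}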

> Bad behavior with 'insert_only' property on non-transactional table
> ---
>
> Key: HIVE-21666
> URL: https://issues.apache.org/jira/browse/HIVE-21666
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Todd Lipcon
>Assignee: Attila Magyar
>Priority: Major
>
> I created a table with 'transactional_properties'='insert_only' but didn't 
> specify 'transactional'='TRUE'. I was able to insert into this table, but 
> when I tried to query it, I got a NumberFormatException since it appears the 
> insert path used a non-transactional layout while the read path expected one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21666) Bad behavior with 'insert_only' property on non-transactional table

2019-06-07 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16858791#comment-16858791
 ] 

Todd Lipcon commented on HIVE-21666:


I can repro with the same commands you have above using a vendor branch of Hive 
3.1. Note that in my install I haven't configured the 'create as acid' option 
which makes tables transactional by default. I am also running using Tez 
execution and not LLAP. Perhaps your installation has that set, so that the 
table becomes ACID by default? Maybe you can repro using 'create external 
table' instead?

> Bad behavior with 'insert_only' property on non-transactional table
> ---
>
> Key: HIVE-21666
> URL: https://issues.apache.org/jira/browse/HIVE-21666
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Reporter: Todd Lipcon
>Assignee: Attila Magyar
>Priority: Major
>
> I created a table with 'transactional_properties'='insert_only' but didn't 
> specify 'transactional'='TRUE'. I was able to insert into this table, but 
> when I tried to query it, I got a NumberFormatException since it appears the 
> insert path used a non-transactional layout while the read path expected one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21796) ArrayWritableObjectInspector.equals can take O(2^nesting_depth) time

2019-05-29 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851089#comment-16851089
 ] 

Todd Lipcon commented on HIVE-21796:


I looked through this code and it wasn't obvious to me how this ends up being 
exponential-time. I would have thought that the recursion depth would be 
exactly the nesting depth without any "fan-out".
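
For reference, a self-contained sketch of how the description's claim would 
arise even with no fan-out in the data: if equals() compares both the field 
list and the by-name map, each nesting level doubles the number of recursive 
calls, so T(d) = 2*T(d-1) = O(2^d). This is illustrative, not the actual 
ArrayWritableObjectInspector code:

{code}
class NestedInspector {
  NestedInspector child;  // one child per level: depth d, no data fan-out

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof NestedInspector)) {
      return false;
    }
    NestedInspector that = (NestedInspector) o;
    // Analogue of fields.equals(that.fields):
    boolean viaFields = child == null ? that.child == null : child.equals(that.child);
    // Analogue of fieldsByName.equals(that.fieldsByName) -- a second full recursion:
    boolean viaFieldsByName = child == null ? that.child == null : child.equals(that.child);
    return viaFields && viaFieldsByName;  // two recursive calls per level
  }

  @Override
  public int hashCode() {
    return child == null ? 0 : 31 * child.hashCode();  // keep equals/hashCode consistent
  }
}
{code}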

> ArrayWritableObjectInspector.equals can take O(2^nesting_depth) time
> 
>
> Key: HIVE-21796
> URL: https://issues.apache.org/jira/browse/HIVE-21796
> Project: Hive
>  Issue Type: Bug
>Reporter: Csaba Ringhofer
>Priority: Major
>
> The issue came up during an Impala test when we tried to run it with Hive 
> 3.1. A query hung: it tried to insert 1 row from a Parquet file with a 
> 99-level nested column into a similar Orc file, and spent its time in 
> ArrayWritableObjectInspector.equals() according to jstack.
> The problem seems to be that equals() calls both fields.equals(that.fields) 
> and fieldsByName.equals(that.fieldsByName), and both try to compare all 
> nested fields recursively, which leads to O(2^nesting_depth) complexity.
> The commit that introduced this behavior:
> https://github.com/apache/hive/commit/98a25f2d831ab27e174bc99792047eaa8ec08b82#diff-8c6363e90d442f239bc252a104f1bfed
> The Impala test:
> https://github.com/apache/impala/blob/9ee4a5e1940afa47227a92e0f6fba6d4c9909f63/tests/query_test/test_nested_types.py#L612



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21667) Bad error message when non-ACID files are put in an insert_only ACID table

2019-05-23 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16846505#comment-16846505
 ] 

Todd Lipcon commented on HIVE-21667:


[~vgumashta] any thoughts on this? I was under the impression that a 
post-upgrade ACID table would have files lying around in the old non-ACID 
layout, because we don't want to incur a bunch of 'move' operations during 
conversion from non-ACID to ACID. Am I wrong about that? In Impala's 
implementation, should we bail if we see a non-conforming file in an ACID table 
directory?
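
For the Impala-side check mentioned above, a rough sketch of what "bail on a 
non-conforming file" could look like; the directory-name patterns only 
approximate Hive's ACID naming (base_N, delta_x_y, plus the visibility suffix 
from HIVE-20823) and are purely illustrative:

{code}
import java.util.regex.Pattern;

class AcidLayoutCheck {
  // Approximation of the ACID directory naming; not the real AcidUtils rules.
  private static final Pattern ACID_DIR = Pattern.compile(
      "base_\\d+(_v\\d+)?|delta_\\d+_\\d+(_\\d+)?|delete_delta_\\d+_\\d+(_\\d+)?");

  // Called for each entry directly under the table/partition directory.
  static void checkEntry(String name, boolean isDirectory) {
    if (!isDirectory || !ACID_DIR.matcher(name).matches()) {
      throw new IllegalStateException(
          "Unexpected file not conforming to ACID layout: " + name);
    }
  }
}
{code}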

> Bad error message when non-ACID files are put in an insert_only ACID table
> --
>
> Key: HIVE-21667
> URL: https://issues.apache.org/jira/browse/HIVE-21667
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 3.1.0
>Reporter: Todd Lipcon
>Priority: Major
>
> I created an insert_only transactional table, and then dropped a text file 
> into the table's directory from a non-transaction-aware client. When I next 
> queried the table, I got the following error:
> Error: java.io.IOException: java.io.IOException: Not a file: 
> hdfs://localhost:20500/test-warehouse/trans/delta_002_002_ 
> (state=,code=0)
> It seems that Hive saw the non-ACID file and fell back to some kind of 
> non-ACID reader path, but then got confused by the delta directory. This case 
> should either fall back to gracefully reading the file, or give an error 
> message like "Unexpected file not conforming to ACID layout: . Data 
> must be loaded into transactional tables using LOAD DATA." or something.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21757) ACID: use a new write id for compaction's output instead of the visibility id

2019-05-23 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16846500#comment-16846500
 ] 

Todd Lipcon commented on HIVE-21757:


[~gopalv] what do you think about allocating a writeID for the compaction, but 
not using it in the filename for the base data? This would serve to bump the 
table's writeID and change its validWriteIdList, which is what we need to 
invalidate the file-listing cache in Impala. Without this, we'll end up getting 
query failures after the cleaner has run.


> ACID: use a new write id for compaction's output instead of the visibility id
> -
>
> Key: HIVE-21757
> URL: https://issues.apache.org/jira/browse/HIVE-21757
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Vaibhav Gumashta
>Priority: Major
>
> HIVE-20823 added support for running compaction within a transaction. To 
> control the visibility of the output directory, it uses 
> base_writeId_visibilityId, where visibilityId is the transaction id of the 
> transaction that the compactor ran in. Perhaps we can keep using the 
> base_writeId format, by allocating a new writeId for the compactor and 
> creating the new base/delta with that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21757) ACID: use a new write id for compaction's output instead of the visibility id

2019-05-21 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844553#comment-16844553
 ] 

Todd Lipcon commented on HIVE-21757:


Can we introduce some other table-level marker indicating that a compaction has 
run, then? We need something to be able to safely cache file lists, and the 
global list of committed txns isn't useful for that, considering it changes on 
every query.

> ACID: use a new write id for compaction's output instead of the visibility id
> -
>
> Key: HIVE-21757
> URL: https://issues.apache.org/jira/browse/HIVE-21757
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Vaibhav Gumashta
>Priority: Major
>
> HIVE-20823 added support for running compaction within a transaction. To 
> control the visibility of the output directory, it uses 
> base_writeId_visibilityId, where visibilityId is the transaction id of the 
> transaction that the compactor ran in. Perhaps we can keep using the 
> base_writeId format, by allocating a new writeId for the compactor and 
> creating the new base/delta with that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21757) ACID: use a new write id for compaction's output instead of the visibility id

2019-05-20 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16844284#comment-16844284
 ] 

Todd Lipcon commented on HIVE-21757:


This is particularly important if we want to be able to cache the file listings 
for a table based on the table's latest write ID. If the compactor doesn't 
change write IDs, but changes the set of files, then that caching strategy 
becomes impossible. Given that file listing is pretty expensive on cloud 
stores, caching them can be quite useful for low-latency queries.

It seems likely this could cause problems for things like replication as well.
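
A sketch of the caching strategy this would enable on the Impala side: key the 
cached listing on the table's validWriteIdList, so any newly allocated writeId 
(including one allocated by the compactor, as proposed here) changes the key 
and the stale entry is simply never consulted again. Names are illustrative:

{code}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class FileListingCache {
  private final Map<String, List<String>> cache = new ConcurrentHashMap<>();

  List<String> getFiles(String table, String validWriteIdList) {
    // A compaction that bumps the table's writeId changes validWriteIdList,
    // which changes the key and implicitly invalidates the old listing.
    String key = table + "|" + validWriteIdList;
    return cache.computeIfAbsent(key, k -> listFilesFromStorage(table));
  }

  private List<String> listFilesFromStorage(String table) {
    // Stand-in for the expensive recursive listing against S3/HDFS.
    throw new UnsupportedOperationException("illustrative only");
  }
}
{code}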

> ACID: use a new write id for compaction's output instead of the visibility id
> -
>
> Key: HIVE-21757
> URL: https://issues.apache.org/jira/browse/HIVE-21757
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Vaibhav Gumashta
>Priority: Major
>
> HIVE-20823 added support for running compaction within a transaction. To 
> control the visibility of the output directory, it uses 
> base_writeId_visibilityId, where visibilityId is the transaction id of the 
> transaction that the compactor ran in. Perhaps we can keep using the 
> base_writeId format, by allocating a new writeId for the compactor and 
> creating the new base/delta with that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21682) Concurrent queries in tez local mode fail

2019-05-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned HIVE-21682:
--

Assignee: (was: Todd Lipcon)

I looked into this for a while. After hacking on it for a few days I couldn't 
get things stable: after fixing the IOContext and ObjectRegistry issues, I hit 
issues where Tez and Hive assume they can write things into the current working 
directory (and different threads in the same process obviously share the same 
working directory). Chasing down all of those cases seemed like too much of a 
pain, so I just moved to a pseudo-distributed YARN cluster for our use case.

I think for this to work properly we'd need to make a Tez Local Mode which 
actually forks out separate JVMs for each Tez child instead of using threads.

> Concurrent queries in tez local mode fail
> -
>
> Key: HIVE-21682
> URL: https://issues.apache.org/jira/browse/HIVE-21682
> Project: Hive
>  Issue Type: Bug
>Reporter: Todd Lipcon
>Priority: Major
>
> As noted in TEZ-3420, Hive running with Tez local mode breaks if multiple 
> queries are submitted concurrently. As I noted 
> [there|https://issues.apache.org/jira/browse/TEZ-3420?focusedCommentId=16831937=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16831937]
>  it seems part of the problem is Hive's use of static global state for 
> IOContext in the case of Tez. Another issue is the use of a JVM-wide 
> ObjectRegistry.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21683) ProxyFileSystem breaks with Hadoop trunk

2019-05-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HIVE-21683:
---
Attachment: hive-21683.patch

> ProxyFileSystem breaks with Hadoop trunk
> 
>
> Key: HIVE-21683
> URL: https://issues.apache.org/jira/browse/HIVE-21683
> Project: Hive
>  Issue Type: Bug
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hive-21683-javassist.patch, hive-21683-simple.patch, 
> hive-21683.patch
>
>
> When trying to run with a recent build of Hadoop which includes HADOOP-15229 
> I ran into the following stack:
> {code}
> Caused by: java.lang.IllegalArgumentException: Wrong FS: 
> pfile:/src/hive/itests/qtest/target/warehouse/src/kv1.txt, expected: file:///
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:793) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:456)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:153)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:354) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.lambda$openFileWithOptions$0(ChecksumFileSystem.java:846)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at org.apache.hadoop.util.LambdaUtils.eval(LambdaUtils.java:52) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.openFileWithOptions(ChecksumFileSystem.java:845)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.FileSystem$FSDataInputStreamBuilder.build(FileSystem.java:4522)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:115) 
> ~[hadoop-mapreduce-client-core-3.1.1.6.0.99.0-135.jar:?]{code}
> We need to add appropriate path-swizzling wrappers for the new APIs in 
> ProxyFileSystem23



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21683) ProxyFileSystem breaks with Hadoop trunk

2019-05-20 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HIVE-21683:
---
Status: Patch Available  (was: Open)

> ProxyFileSystem breaks with Hadoop trunk
> 
>
> Key: HIVE-21683
> URL: https://issues.apache.org/jira/browse/HIVE-21683
> Project: Hive
>  Issue Type: Bug
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hive-21683-javassist.patch, hive-21683-simple.patch, 
> hive-21683.patch
>
>
> When trying to run with a recent build of Hadoop which includes HADOOP-15229 
> I ran into the following stack:
> {code}
> Caused by: java.lang.IllegalArgumentException: Wrong FS: 
> pfile:/src/hive/itests/qtest/target/warehouse/src/kv1.txt, expected: file:///
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:793) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:456)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:153)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:354) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.lambda$openFileWithOptions$0(ChecksumFileSystem.java:846)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at org.apache.hadoop.util.LambdaUtils.eval(LambdaUtils.java:52) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.openFileWithOptions(ChecksumFileSystem.java:845)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.FileSystem$FSDataInputStreamBuilder.build(FileSystem.java:4522)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:115) 
> ~[hadoop-mapreduce-client-core-3.1.1.6.0.99.0-135.jar:?]{code}
> We need to add appropriate path-swizzling wrappers for the new APIs in 
> ProxyFileSystem23



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21749) ACID: Provide an option to run Cleaner thread from Hive client

2019-05-17 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16842418#comment-16842418
 ] 

Todd Lipcon commented on HIVE-21749:


Specifically it would be useful to clean a specified table or database, rather 
than running the full thread. The use case here is in integration tests where I 
want to mutate a table, compact it, run more queries, then clean it, without 
any background cleaning behavior going on.
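
An illustrative version of that test flow, via JDBC. The COMPACT statement is 
existing Hive syntax; the final per-table clean step is hypothetical, standing 
in for whatever trigger this JIRA ends up adding:

{code}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

class CompactAndCleanTest {
  void run() throws Exception {
    try (Connection conn =
             DriverManager.getConnection("jdbc:hive2://localhost:10000/default");
         Statement stmt = conn.createStatement()) {
      stmt.execute("INSERT INTO t VALUES (1)");        // mutate
      stmt.execute("ALTER TABLE t COMPACT 'major'");   // compact (existing syntax)
      // ... wait for the compaction to complete ...
      stmt.execute("ALTER TABLE t CLEAN");             // hypothetical per-table clean
      // ... run more queries/assertions with no background cleaner enabled
    }
  }
}
{code}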

> ACID: Provide an option to run Cleaner thread from Hive client
> --
>
> Key: HIVE-21749
> URL: https://issues.apache.org/jira/browse/HIVE-21749
> Project: Hive
>  Issue Type: Bug
>  Components: Transactions
>Affects Versions: 4.0.0
>Reporter: Vaibhav Gumashta
>Assignee: Vaibhav Gumashta
>Priority: Major
>
> In some cases, it could be useful to trigger the cleaner thread manually. We 
> should provide an option for that.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21683) ProxyFileSystem breaks with Hadoop trunk

2019-05-02 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HIVE-21683:
---
Attachment: hive-21683-simple.patch

> ProxyFileSystem breaks with Hadoop trunk
> 
>
> Key: HIVE-21683
> URL: https://issues.apache.org/jira/browse/HIVE-21683
> Project: Hive
>  Issue Type: Bug
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hive-21683-javassist.patch, hive-21683-simple.patch
>
>
> When trying to run with a recent build of Hadoop which includes HADOOP-15229 
> I ran into the following stack:
> {code}
> Caused by: java.lang.IllegalArgumentException: Wrong FS: 
> pfile:/src/hive/itests/qtest/target/warehouse/src/kv1.txt, expected: file:///
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:793) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:456)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:153)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:354) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.lambda$openFileWithOptions$0(ChecksumFileSystem.java:846)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at org.apache.hadoop.util.LambdaUtils.eval(LambdaUtils.java:52) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.openFileWithOptions(ChecksumFileSystem.java:845)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.FileSystem$FSDataInputStreamBuilder.build(FileSystem.java:4522)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:115) 
> ~[hadoop-mapreduce-client-core-3.1.1.6.0.99.0-135.jar:?]{code}
> We need to add appropriate path-swizzling wrappers for the new APIs in 
> ProxyFileSystem23



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21683) ProxyFileSystem breaks with Hadoop trunk

2019-05-02 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16832242#comment-16832242
 ] 

Todd Lipcon commented on HIVE-21683:


Attached is one approach that uses javassist to dynamically wrap any newly 
added methods for ProxyFileSystem23. I verified this fixes the issue.

I'll also momentarily attach a more "straightforward" approach which just adds 
the new method. The problem with this latter approach is that it won't compile 
against Hadoop 3.1, since the new methods are in Hadoop 3.3 (not yet released). 
We could just wait until that releases before committing if we want to go with 
the simpler approach, though.
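
For context, the "straightforward" variant amounts to overriding the new 
openFile() hook from HADOOP-15229 so the pfile: path gets swizzled back to the 
real scheme before delegating, mirroring the existing overrides. A sketch, 
assuming ProxyFileSystem's swizzleParamPath() helper and compiling only against 
Hadoop 3.3+:

{code}
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.impl.OpenFileParameters;

public class ProxyFileSystem23 extends ProxyFileSystem {
  @Override
  protected CompletableFuture<FSDataInputStream> openFileWithOptions(
      Path path, OpenFileParameters parameters) throws IOException {
    // Swizzle pfile:/... back to the underlying scheme before delegating.
    return super.openFileWithOptions(swizzleParamPath(path), parameters);
  }
}
{code}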

> ProxyFileSystem breaks with Hadoop trunk
> 
>
> Key: HIVE-21683
> URL: https://issues.apache.org/jira/browse/HIVE-21683
> Project: Hive
>  Issue Type: Bug
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hive-21683-javassist.patch
>
>
> When trying to run with a recent build of Hadoop which includes HADOOP-15229 
> I ran into the following stack:
> {code}
> Caused by: java.lang.IllegalArgumentException: Wrong FS: 
> pfile:/src/hive/itests/qtest/target/warehouse/src/kv1.txt, expected: file:///
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:793) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:456)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:153)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:354) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.lambda$openFileWithOptions$0(ChecksumFileSystem.java:846)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at org.apache.hadoop.util.LambdaUtils.eval(LambdaUtils.java:52) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.openFileWithOptions(ChecksumFileSystem.java:845)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.FileSystem$FSDataInputStreamBuilder.build(FileSystem.java:4522)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:115) 
> ~[hadoop-mapreduce-client-core-3.1.1.6.0.99.0-135.jar:?]{code}
> We need to add appropriate path-swizzling wrappers for the new APIs in 
> ProxyFileSystem23



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21683) ProxyFileSystem breaks with Hadoop trunk

2019-05-02 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HIVE-21683:
---
Attachment: hive-21683-javassist.patch

> ProxyFileSystem breaks with Hadoop trunk
> 
>
> Key: HIVE-21683
> URL: https://issues.apache.org/jira/browse/HIVE-21683
> Project: Hive
>  Issue Type: Bug
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hive-21683-javassist.patch
>
>
> When trying to run with a recent build of Hadoop which includes HADOOP-15229 
> I ran into the following stack:
> {code}
> Caused by: java.lang.IllegalArgumentException: Wrong FS: 
> pfile:/src/hive/itests/qtest/target/warehouse/src/kv1.txt, expected: file:///
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:793) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:456)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:153)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:354) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.lambda$openFileWithOptions$0(ChecksumFileSystem.java:846)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at org.apache.hadoop.util.LambdaUtils.eval(LambdaUtils.java:52) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.openFileWithOptions(ChecksumFileSystem.java:845)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.FileSystem$FSDataInputStreamBuilder.build(FileSystem.java:4522)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:115) 
> ~[hadoop-mapreduce-client-core-3.1.1.6.0.99.0-135.jar:?]{code}
> We need to add appropriate path-swizzling wrappers for the new APIs in 
> ProxyFileSystem23



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21683) ProxyFileSystem breaks with Hadoop trunk

2019-05-02 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned HIVE-21683:
--


> ProxyFileSystem breaks with Hadoop trunk
> 
>
> Key: HIVE-21683
> URL: https://issues.apache.org/jira/browse/HIVE-21683
> Project: Hive
>  Issue Type: Bug
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> When trying to run with a recent build of Hadoop which includes HADOOP-15229 
> I ran into the following stack:
> {code}
> Caused by: java.lang.IllegalArgumentException: Wrong FS: 
> pfile:/src/hive/itests/qtest/target/warehouse/src/kv1.txt, expected: file:///
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:793) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:456)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:153)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:354) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.lambda$openFileWithOptions$0(ChecksumFileSystem.java:846)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at org.apache.hadoop.util.LambdaUtils.eval(LambdaUtils.java:52) 
> ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.ChecksumFileSystem.openFileWithOptions(ChecksumFileSystem.java:845)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.fs.FileSystem$FSDataInputStreamBuilder.build(FileSystem.java:4522)
>  ~[hadoop-common-3.1.1.6.0.99.0-135.jar:?]
> at 
> org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:115) 
> ~[hadoop-mapreduce-client-core-3.1.1.6.0.99.0-135.jar:?]{code}
> We need to add appropriate path-swizzling wrappers for the new APIs in 
> ProxyFileSystem23



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21682) Concurrent queries in tez local mode fail

2019-05-02 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned HIVE-21682:
--


> Concurrent queries in tez local mode fail
> -
>
> Key: HIVE-21682
> URL: https://issues.apache.org/jira/browse/HIVE-21682
> Project: Hive
>  Issue Type: Bug
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
>
> As noted in TEZ-3420, Hive running with Tez local mode breaks if multiple 
> queries are submitted concurrently. As I noted 
> [there|https://issues.apache.org/jira/browse/TEZ-3420?focusedCommentId=16831937=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16831937]
>  it seems part of the problem is Hive's use of static global state for 
> IOContext in the case of Tez. Another issue is the use of a JVM-wide 
> ObjectRegistry.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21675) CREATE VIEW IF NOT EXISTS broken

2019-05-01 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HIVE-21675:
---
Attachment: hive-21675.txt

> CREATE VIEW IF NOT EXISTS broken
> 
>
> Key: HIVE-21675
> URL: https://issues.apache.org/jira/browse/HIVE-21675
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hive-21675.txt
>
>
> CREATE VIEW IF NOT EXISTS returns an error rather than "OK" if the view 
> already exists. This is a regression from Hive 2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21675) CREATE VIEW IF NOT EXISTS broken

2019-05-01 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HIVE-21675:
---
Status: Patch Available  (was: Open)

[~jcamachorodriguez] [~daijy] can you take a look? Seems you touched this area 
of the code most recently.

> CREATE VIEW IF NOT EXISTS broken
> 
>
> Key: HIVE-21675
> URL: https://issues.apache.org/jira/browse/HIVE-21675
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Attachments: hive-21675.txt
>
>
> CREATE VIEW IF NOT EXISTS returns an error rather than "OK" if the view 
> already exists. This is a regression from Hive 2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21675) CREATE VIEW IF NOT EXISTS broken

2019-05-01 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831313#comment-16831313
 ] 

Todd Lipcon commented on HIVE-21675:


Looks like HIVE-20462 already fixed this in Hive 3.x, but the fix got the 
semantics wrong and it behaves as CREATE OR REPLACE.

> CREATE VIEW IF NOT EXISTS broken
> 
>
> Key: HIVE-21675
> URL: https://issues.apache.org/jira/browse/HIVE-21675
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
>
> CREATE VIEW IF NOT EXISTS returns an error rather than "OK" if the view 
> already exists. This is a regression from Hive 2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (HIVE-21675) CREATE VIEW IF NOT EXISTS broken

2019-05-01 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon reassigned HIVE-21675:
--

Assignee: Todd Lipcon

> CREATE VIEW IF NOT EXISTS broken
> 
>
> Key: HIVE-21675
> URL: https://issues.apache.org/jira/browse/HIVE-21675
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
>
> CREATE VIEW IF NOT EXISTS returns an error rather than "OK" if the view 
> already exists. This is a regression from Hive 2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21675) CREATE VIEW IF NOT EXISTS broken

2019-05-01 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831307#comment-16831307
 ] 

Todd Lipcon commented on HIVE-21675:


Looks like in Hive 3.1 it throws an error if the view already exists (ignoring 
the IF NOT EXISTS clause). In trunk, it treats it as if it were a REPLACE.

> CREATE VIEW IF NOT EXISTS broken
> 
>
> Key: HIVE-21675
> URL: https://issues.apache.org/jira/browse/HIVE-21675
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Todd Lipcon
>Priority: Critical
>
> CREATE VIEW IF NOT EXISTS returns an error rather than "OK" if the view 
> already exists. This is a regression from Hive 2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-1083) allow sub-directories for an external table/partition

2019-04-25 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16826522#comment-16826522
 ] 

Todd Lipcon commented on HIVE-1083:
---

It seems that recursive reading is the default now for Tez execution. Perhaps 
this can be closed.

> allow sub-directories for an external table/partition
> -
>
> Key: HIVE-1083
> URL: https://issues.apache.org/jira/browse/HIVE-1083
> Project: Hive
>  Issue Type: Improvement
>  Components: Query Processor
>Affects Versions: 0.6.0
>Reporter: Namit Jain
>Assignee: Zheng Shao
>Priority: Major
>  Labels: inputformat
>
> Sometimes users want to define an external table/partition based on all files 
> (recursively) inside a directory.
> Currently most of the Hadoop InputFormat classes do not support that. We 
> should extract all files recursively in the directory, and add them to the 
> input path of the job.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21506) Memory based TxnHandler implementation

2019-04-22 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823504#comment-16823504
 ] 

Todd Lipcon commented on HIVE-21506:


bq.  My understanding is that we are not yet blocked by the concurrency checks 
when acquiring locks, but the bottleneck is simply the number of HMS/RDBMS 
calls implementing that.

Agreed with that, and with the general idea that we should understand the 
workload. That said, I don't know that we need a specific workload to agree on 
the central observation that most queries against Hive are read-only, given our 
focus on warehousing and datamart applications (Hive isn't an OLTP database by 
any stretch). I did a spot check on the ratio of read-only queries to DML in 
some customer profile datasets I have, and it ranges from about 300:1 for some 
customers down to about 1:1, with an average of 7:1.

> Memory based TxnHandler implementation
> --
>
> Key: HIVE-21506
> URL: https://issues.apache.org/jira/browse/HIVE-21506
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Peter Vary
>Priority: Major
>
> The current TxnHandler implementations use the backend RDBMS to store all 
> Hive lock and transaction data, so multiple TxnHandler instances can run 
> simultaneously and serve requests. The continuous communication/locking done 
> on the RDBMS side puts serious load on the backend databases and also 
> restricts the possible throughput.
> If it is possible to have only a single active TxnHandler instance (with the 
> current design, HMS), then we can provide much better performance (using 
> only Java-based locking). We still have to store the committed write 
> transactions in the RDBMS (or later some other persistent storage), but 
> other lock and transaction operations could remain memory-only.
> The most important drawbacks of this solution are that we definitely lose 
> scalability once a single TxnHandler instance is no longer able to serve the 
> requests (see NameNode), and fault tolerance, in the sense that ongoing 
> transactions must be terminated when the TxnHandler fails. If these 
> drawbacks are acceptable in certain situations, then we can provide better 
> throughput for the users.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-21596) HiveMetastoreClient should be able to connect to older metastore servers

2019-04-09 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-21596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HIVE-21596:
---
Description: 
{{HiveMetastoreClient}} currently depends on the fact that both the client and 
server versions are the same. Additionally, since the server APIs are backwards 
compatible, it is possible for an older client (eg. a 2.1.0 client) to connect 
to a newer server (eg. a 3.1.0 server) without any issues. This is useful in 
setups where HMS is deployed in a remote mode and clients connect to it 
remotely.

It would be a good improvement if a newer version of {{HiveMetastoreClient}} 
could connect to an older server version. When a newer client is talking to an 
older server, the following things can happen:

1. Client invokes an RPC which doesn't exist on the older server.
In such a case, thrift will throw an {{Invalid method name}} exception, which 
should automatically be handled by the clients since each API already throws 
TException.

2. Client invokes an RPC using thrift objects which have new fields added.
When a new field is added to a thrift object, the server does not deserialize 
the field in the first place, since it does not know about that field id. So 
wire-compatibility already exists. However, the client-side application should 
understand the implications of such behavior. In such cases, it would be better 
for the client to throw an exception after checking the server version, which 
was added in HIVE-21484.

3. The newer client has re-implemented a certain API, for example using a 
newer, more efficient thrift API, while an older thrift API that provides the 
same functionality also exists. In this case, the new client will start seeing 
the {{Invalid method name}} exception since the older server does not have such 
a method. This can be handled on the client side by making sure that the newer 
implementation is conditional on the server version, falling back to the older 
(maybe less efficient) one when necessary. That means the client should check 
the server version and invoke the new implementation only if the server version 
supports the newer API. (On a side note, it would be great if the metastore 
also gave information about which APIs are supported for a given version.)

One real-world use case for such a feature is Impala, which wants to be able to 
talk to both HMS 2.x and HMS 3.x. But other applications like Spark (or 
third-party applications which want to support multiple HMS versions) may also 
find this useful.

  was:
{{HiveMetastoreClient}} currently depends on the fact that both the client and 
server versions are the same. Additionally, since the server APIs are backwards 
compatible, it is possible for an older client (eg. a 2.1.0 client) to connect 
to a newer server (eg. a 3.1.0 server) without any issues. This is useful in 
setups where HMS is deployed in a remote mode and clients connect to it 
remotely.

It would be a good improvement if a newer version of {{HiveMetastoreClient}} 
could connect to an older server version. When a newer client is talking to an 
older server, the following things can happen:

1. Client invokes an RPC which doesn't exist on the older server.
In such a case, thrift will throw an {{Invalid method name}} exception, which 
should automatically be handled by the clients since each API already throws 
TException.

2. Client invokes an RPC using thrift objects which have new fields added.
When a new field is added to a thrift object, the server does not deserialize 
the field in the first place, since it does not know about that field id. So 
wire-compatibility already exists. However, the client-side application should 
understand the implications of such behavior. In such cases, it would be better 
for the client to throw an exception after checking the server version, which 
was added in HIVE-21484.

3. If the newer client has re-implemented a certain API, for example using a 
newer thrift API, the client will start seeing the {{Invalid method name}} 
exception since the older server does not have such a method.
This can be handled on the client side by making sure that the newer 
implementation is conditional on the server version. That means the client 
should check the server version and invoke the new implementation only if the 
server version supports the newer API. (On a side note, it would be great if 
the metastore also gave information about which APIs are supported for a given 
version.)

One real-world use case for such a feature is Impala, which wants to be able to 
talk to both HMS 2.x and HMS 3.x. But other applications like Spark (or 
third-party applications which want to support multiple HMS versions) may also 
find this useful.
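
A sketch of how case 3 can look from the client side: try the newer RPC and 
fall back when the server predates it. newApi()/oldApi() are hypothetical 
stand-ins for concrete ThriftHiveMetastore calls; UNKNOWN_METHOD is what thrift 
reports for a method the server doesn't know:

{code}
import java.util.List;
import org.apache.thrift.TApplicationException;
import org.apache.thrift.TException;

class VersionFallbackExample {
  List<String> fetch() throws TException {
    try {
      return newApi();                 // newer, more efficient RPC
    } catch (TApplicationException e) {
      if (e.getType() != TApplicationException.UNKNOWN_METHOD) {
        throw e;
      }
      return oldApi();                 // older, equivalent RPC
    }
  }

  private List<String> newApi() throws TException {
    throw new UnsupportedOperationException("illustrative only");
  }

  private List<String> oldApi() throws TException {
    throw new UnsupportedOperationException("illustrative only");
  }
}
{code}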


> HiveMetastoreClient should be able to connect to older metastore servers
> 
>
> Key: HIVE-21596

[jira] [Commented] (HIVE-21506) Memory based TxnHandler implementation

2019-04-04 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810171#comment-16810171
 ] 

Todd Lipcon commented on HIVE-21506:


http://www.vldb.org/pvldb/2/vldb09-157.pdf is a good paper that talks about 
techniques like the above, where a lock is temporally extended across multiple 
transactions to reduce lock manager contention.

> Memory based TxnHandler implementation
> --
>
> Key: HIVE-21506
> URL: https://issues.apache.org/jira/browse/HIVE-21506
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Peter Vary
>Priority: Major
>
> The current TxnHandler implementations use the backend RDBMS to store all 
> Hive lock and transaction data, so multiple TxnHandler instances can run 
> simultaneously and serve requests. The continuous communication/locking done 
> on the RDBMS side puts serious load on the backend databases and also 
> restricts the possible throughput.
> If it is possible to have only a single active TxnHandler instance (with the 
> current design, HMS), then we can provide much better performance (using 
> only Java-based locking). We still have to store the committed write 
> transactions in the RDBMS (or later some other persistent storage), but 
> other lock and transaction operations could remain memory-only.
> The most important drawbacks of this solution are that we definitely lose 
> scalability once a single TxnHandler instance is no longer able to serve the 
> requests (see NameNode), and fault tolerance, in the sense that ongoing 
> transactions must be terminated when the TxnHandler fails. If these 
> drawbacks are acceptable in certain situations, then we can provide better 
> throughput for the users.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21506) Memory based TxnHandler implementation

2019-04-04 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810130#comment-16810130
 ] 

Todd Lipcon commented on HIVE-21506:


Does this imply that you'd move the transaction manager out of the HMS into a 
standalone daemon? Then we need to worry about HA for that daemon as well as 
what happens to locks if the daemon crashes, right? It's probably possible but 
would be quite a bit of work.

Another potential design for scaling lock management is to use revocable 
"sticky" locks. I think there's a decent amount of literature from the 
shared-nothing DBMS world on this technique. With some quick googling I found 
that the Frangipani DFS paper has some discussion of the technique, but I think 
we can probably find a more detailed description of it elsewhere.

At a very high level, the idea would look something like this for shared locks:
- when a user wants to acquire a shared lock, the HMS checks if any other 
transaction already has the same shared lock held. If not, we acquire the lock 
in the database, and associate it not with any particular transaction, but with 
the HMS's lock manager itself. The HMS then becomes responsible for 
heartbeating the lock while it's held. In essence, the HMS has now taken out a 
"lease" on this lock.
- if the HMS already has this shared lock held on behalf of another 
transaction, increment an in-memory reference count.
- when a lock is released, if the refcount > 1, simply decrement the in-memory 
ref count.
- if the lock is released and the refcount goes to 0, the HMS can be lazy about 
releasing the lock in the DB (either forever or for some amount of time). In 
essence the lock is "sticky".

Given that most locks are shared locks, this should mean that the majority of 
locking operations do not require any trip to the RDBMS and can be processed in 
memory, but are backed persistently by a lock in the DB.

If a caller wants to acquire an exclusive lock which conflicts with an existing 
shared lock in the DB, we need to implement revocation:
- add a record indicating that there's a waiter on the lock, blocked on the 
existing shared lock(s)
- send a revocation request to any HMS(s) holding the shared locks. In the case 
that they're just being held in "sticky" mode, they can be revoked immediately. 
If there is actually an active refcount, this will just enforce that new shared 
lockers need to wait instead of incrementing the refcount.
- in the case that an HMS holding a sticky lock has crashed or been 
partitioned, we need to wait for the "lease" to expire before we can revoke its 
lock.

There's some trickiness to think through about client->HMS "stickiness" in HA 
scenarios, as well. Right now, the lock requests may be sent to a different HMS 
than the 'commit/abort' request for a transaction, but that could be difficult 
with "sticky locks".


All of the above is a bit complicated, so maybe a first step is to just look at 
some kind of stress test/benchmark and understand whether we can make any 
changes to the way we manage the RDBMS tables to be more efficient. Perhaps if 
we specialize the implementation for a specific RDBMS (eg postgres) we could 
get some benefits here (eg stored procedures to avoid round trips, if that 
turns out to be the bottleneck).
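
To make the refcounting part concrete, a minimal sketch of the "sticky" shared 
lock bookkeeping described above, with hypothetical acquireInDb()/releaseInDb() 
helpers standing in for the RDBMS round trips:

{code}
import java.util.HashMap;
import java.util.Map;

class StickyLockManager {
  private final Map<String, Integer> refCounts = new HashMap<>();

  synchronized void acquireShared(String lockKey) {
    Integer count = refCounts.get(lockKey);
    if (count == null) {
      acquireInDb(lockKey);               // first holder: take the lease in the RDBMS
      refCounts.put(lockKey, 1);
    } else {
      refCounts.put(lockKey, count + 1);  // in-memory only, no RDBMS trip
    }
  }

  synchronized void releaseShared(String lockKey) {
    int count = refCounts.get(lockKey);
    // Sticky: on the last release, keep the DB lock and just drop to zero.
    refCounts.put(lockKey, count - 1);
  }

  synchronized void handleRevocation(String lockKey) {
    Integer count = refCounts.get(lockKey);
    if (count == null || count == 0) {
      releaseInDb(lockKey);               // no active holders: release immediately
      refCounts.remove(lockKey);
    }
    // else: active holders; new shared lockers must now wait (not shown)
  }

  private void acquireInDb(String lockKey) { /* hypothetical RDBMS insert */ }
  private void releaseInDb(String lockKey) { /* hypothetical RDBMS delete */ }
}
{code}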

> Memory based TxnHandler implementation
> --
>
> Key: HIVE-21506
> URL: https://issues.apache.org/jira/browse/HIVE-21506
> Project: Hive
>  Issue Type: New Feature
>  Components: Transactions
>Reporter: Peter Vary
>Priority: Major
>
> The current TxnHandler implementations use the backend RDBMS to store all 
> Hive lock and transaction data, so multiple TxnHandler instances can run 
> simultaneously and serve requests. The continuous communication/locking done 
> on the RDBMS side puts serious load on the backend databases and also 
> restricts the possible throughput.
> If it is possible to have only a single active TxnHandler instance (with the 
> current design, HMS), then we can provide much better performance (using 
> only Java-based locking). We still have to store the committed write 
> transactions in the RDBMS (or later some other persistent storage), but 
> other lock and transaction operations could remain memory-only.
> The most important drawbacks of this solution are that we definitely lose 
> scalability once a single TxnHandler instance is no longer able to serve the 
> requests (see NameNode), and fault tolerance, in the sense that ongoing 
> transactions must be terminated when the TxnHandler fails. If these 
> drawbacks are acceptable in certain situations, then we can provide better 
> throughput for the users.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21571) SHOW COMPACTIONS shows column names as its first output row

2019-04-03 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16809311#comment-16809311
 ] 

Todd Lipcon commented on HIVE-21571:


{code}
+---------------+-----------+----------+------------+--------+------------+-----------+-------------+---------------+--------------+
| compactionid  |  dbname   | tabname  |  partname  |  type  |   state    | workerid  |  starttime  |   duration    | hadoopjobid  |
+---------------+-----------+----------+------------+--------+------------+-----------+-------------+---------------+--------------+
| CompactionId  | Database  | Table    | Partition  | Type   | State      | Worker    | Start Time  | Duration(ms)  | HadoopJobId  |
| 1             | default   | t2       |  ---       | MAJOR  | initiated  |  ---      |  ---        |  ---          | None         |
+---------------+-----------+----------+------------+--------+------------+-----------+-------------+---------------+--------------+
2 rows selected (0.034 seconds)
{code}

> SHOW COMPACTIONS shows column names as its first output row
> ---
>
> Key: HIVE-21571
> URL: https://issues.apache.org/jira/browse/HIVE-21571
> Project: Hive
>  Issue Type: Bug
>Reporter: Todd Lipcon
>Priority: Major
>
> SHOW COMPACTIONS yields a resultset with nice column names, and then the 
> first row of data is a repetition of those column names. This is somewhat 
> confusing and hard to read.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-21487) COMPLETED_COMPACTIONS table missing appropriate indexes

2019-03-21 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-21487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798224#comment-16798224
 ] 

Todd Lipcon commented on HIVE-21487:


Seems that this cluster is in some state where all compactions are failing (I 
think due to some other testing going on with a strange setup). The query is 
likely coming from CompactionTxnHandler.checkFailedCompactions.

It's possible this isn't a hot enough code path to be worth a really large 
index, but perhaps it's worth one at least on the database/table level.

> COMPLETED_COMPACTIONS table missing appropriate indexes
> ---
>
> Key: HIVE-21487
> URL: https://issues.apache.org/jira/browse/HIVE-21487
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.1
>Reporter: Todd Lipcon
>Priority: Major
>
> Looking at a MySQL install where HMS is pointed on Hive 3.1, I see a constant 
> stream of queries of the form:
> {code}
> select CC_STATE from COMPLETED_COMPACTIONS where CC_DATABASE = 
> 'tpcds_orc_exact_1000' and CC_TABLE = 'catalog_returns' and CC_PARTITION = 
> 'cr_returned_date_sk=2452851' and CC_STATE != 'a' order by CC_ID desc;
> {code}
> but the COMPLETED_COMPACTIONS table has no index. In this case it's resulting 
> in a full table scan over 115k rows, which takes around 100ms.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19685) OpenTracing support for HMS

2018-06-05 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16502374#comment-16502374
 ] 

Todd Lipcon commented on HIVE-19685:


Thanks Vihang, was OOO yesterday.

> OpenTracing support for HMS
> ---
>
> Key: HIVE-19685
> URL: https://issues.apache.org/jira/browse/HIVE-19685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: HIVE-19685.02.patch, hive-19685.patch, hive-19685.patch, 
> trace.png
>
>
> When diagnosing performance of metastore operations it isn't always obvious 
> why something took a long time. Using a tracing framework can provide an 
> end-to-end view of an operation including time spent in dependent systems (eg 
> filesystem operations, RDBMS queries, etc). This JIRA proposes to integrate 
> OpenTracing, which is a vendor-neutral tracing API into the HMS server and 
> client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name

2018-05-30 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495532#comment-16495532
 ] 

Todd Lipcon commented on HIVE-19605:


Ah, yea, you're right, thanks.

> TAB_COL_STATS table has no index on db/table name
> -
>
> Key: HIVE-19605
> URL: https://issues.apache.org/jira/browse/HIVE-19605
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Todd Lipcon
>Assignee: Vihang Karajgaonkar
>Priority: Major
> Attachments: HIVE-19605.01.patch
>
>
> The TAB_COL_STATS table is missing an index on (CAT_NAME, DB_NAME, 
> TABLE_NAME). The getTableColumnStatistics call queries based on this tuple. 
> This makes those queries take a significant amount of time in large 
> metastores since they do a full table scan.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19715) Consolidated and flexible API for fetching partition metadata from HMS

2018-05-29 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493797#comment-16493797
 ] 

Todd Lipcon commented on HIVE-19715:


bq. I also think we should at-least deprecate the older APIs so that clients 
can move to the newer APIs in the near future

Agreed, but I think realistically you can basically never remove a wire API 
that has been in use for so long with so many integrations. Marking it as 
deprecated is fine but my guess is we're stuck with our existing set for many 
more years even if the major ecosystem consumers move to the new one.

bq. handling of partition expressions - it would be great to avoid sending 
serialized Java classes and UDFs via Thrift API.

Agreed with that. Defining a language-agnostic way to pass arbitrary expression 
trees over Thrift is probably a larger project, though, so I think we should 
leave that out of the initial scope for this API. Since the API would be 
designed with several options for specifying the filtering criteria, it 
shouldn't be too bad to add it in as a "v2".

Perhaps for "V1" we could just support the most commonly used simple criteria 
like equality or range predicates on individual columns. It seems that Presto 
for example only uses that functionality anyway today 
(https://github.com/prestodb/presto/blob/0.179/presto-hive/src/main/java/com/facebook/presto/hive/HivePartitionManager.java#L215)

bq. One interesting side-effect of returning only subset of interesting fields 
of the partition objects is we probably will have to change the partition 
fields as optional instead of the required. This can create a trickle down 
effect all the way down to the database and I am not sure what complications 
can it cause. Thoughts?

Moving Thrift fields from "required" to "optional" is allowed so long as you 
don't try to send an object with a missing field to an old client that still 
thinks it is "required"; that would cause the old client to fail. So, as long 
as we ensure that the existing APIs continue to set all fields that are marked 
"required" today, it would not be a wire-breaking change to downgrade them to 
optional and conditionally populate them in the new API.

As for "trickle down to the database", I'm not sure I follow. On any API calls 
that today take a Partition object as an input (eg add_partition, 
alter_partition, etc) we'd need to add validation to ensure that all fields 
that are expected are set, whereas today we rely on Thrift to do so. But that 
should be all, right?

bq. Are you proposing Thrift version of Java interning?

Somewhat, yea -- or just a "symbol table" or "string table" if you will.

bq. Should we also have a unified way to send list of locations as a path trie 
(or some other compressed form)?

That's an interesting idea, though it certainly increases complexity since 
clients would also need to decode the trie. Do you think we have end-user 
clients who could consume the strings in their trie-encoded form, or would we 
have to just "decompress" them within the Metastore Client code and provide the 
end user with complete Strings? If the latter, I'm not sure if there would be a 
big gain vs compressing the whole thrift object on the wire with something like 
LZ4 or Snappy.
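
To make the tradeoff concrete, here is a purely illustrative prefix-sharing 
sketch (all names are made up for illustration; this is not a proposed API):

{code}
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PrefixEncodingDemo {
  public static void main(String[] args) {
    // Partition locations typically share a long common prefix.
    List<String> locations = Arrays.asList(
        "hdfs://nn/warehouse/t/ds=2018-05-01",
        "hdfs://nn/warehouse/t/ds=2018-05-02");
    // Send the shared prefix once, then only the per-partition suffixes...
    String prefix = "hdfs://nn/warehouse/t/";
    List<String> suffixes = locations.stream()
        .map(loc -> loc.substring(prefix.length()))
        .collect(Collectors.toList());
    // ...and reassemble full paths on the receiving side. The question
    // above is whether this reassembly happens in end-user code or is
    // hidden inside the Metastore Client.
    suffixes.forEach(s -> System.out.println(prefix + s));
  }
}
{code}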


> Consolidated and flexible API for fetching partition metadata from HMS
> --
>
> Key: HIVE-19715
> URL: https://issues.apache.org/jira/browse/HIVE-19715
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Todd Lipcon
>Assignee: Vihang Karajgaonkar
>Priority: Major
>
> Currently, the HMS thrift API exposes 17 different APIs for fetching 
> partition-related information. There is somewhat of a combinatorial explosion 
> going on, where each API has variants with and without "auth" info, by pspecs 
> vs names, by filters, by exprs, etc. Having all of these separate APIs long 
> term is a maintenance burden and also more confusing for consumers.
> Additionally, even with all of these APIs, there is a lack of granularity in 
> fetching only the information needed for a particular use case. For example, 
> in some use cases it may be beneficial to only fetch the partition locations 
> without wasting effort fetching statistics, etc.
> This JIRA proposes that we add a new "one API to rule them all" for fetching 
> partition info. The request and response would be encapsulated in structs. 
> Some desirable properties:
> - the request should be able to specify which pieces of information are 
> required (eg location, properties, etc)
> - in the case of partition parameters, the request should be able to do 
> either whitelisting or blacklisting (eg to exclude large incremental column 
> stats HLL dumped in there by Impala)

[jira] [Commented] (HIVE-19685) OpenTracing support for HMS

2018-05-29 Thread Todd Lipcon (JIRA)


[ 
https://issues.apache.org/jira/browse/HIVE-19685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493771#comment-16493771
 ] 

Todd Lipcon commented on HIVE-19685:


OK, re-uploaded.

> OpenTracing support for HMS
> ---
>
> Key: HIVE-19685
> URL: https://issues.apache.org/jira/browse/HIVE-19685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hive-19685.patch, hive-19685.patch, trace.png
>
>
> When diagnosing performance of metastore operations it isn't always obvious 
> why something took a long time. Using a tracing framework can provide an 
> end-to-end view of an operation including time spent in dependent systems (eg 
> filesystem operations, RDBMS queries, etc). This JIRA proposes to integrate 
> OpenTracing, which is a vendor-neutral tracing API into the HMS server and 
> client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19685) OpenTracing support for HMS

2018-05-29 Thread Todd Lipcon (JIRA)


 [ 
https://issues.apache.org/jira/browse/HIVE-19685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HIVE-19685:
---
Attachment: hive-19685.patch

> OpenTracing support for HMS
> ---
>
> Key: HIVE-19685
> URL: https://issues.apache.org/jira/browse/HIVE-19685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hive-19685.patch, hive-19685.patch, trace.png
>
>
> When diagnosing performance of metastore operations it isn't always obvious 
> why something took a long time. Using a tracing framework can provide an 
> end-to-end view of an operation including time spent in dependent systems (eg 
> filesystem operations, RDBMS queries, etc). This JIRA proposes to integrate 
> OpenTracing, which is a vendor-neutral tracing API into the HMS server and 
> client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19715) Consolidated and flexible API for fetching partition metadata from HMS

2018-05-25 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491149#comment-16491149
 ] 

Todd Lipcon commented on HIVE-19715:


bq. the response should be designed in such a way as to avoid transferring 
redundant information for common cases (eg simple "dictionary coding" of 
strings like parameter names, etc)

To elaborate on this, the idea is that the response could have a 'list<string> 
string_pool' member at the top level. Underlying partition info like storage 
descriptor input formats, serde names, parameters, etc, can use integer indexes 
into the string_pool. This can likely reduce the size of responses on the wire 
as well as memory/GC/CPU costs while deserializing.
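
As a rough illustration of the decoding side (the struct and field names here 
are hypothetical, not a committed schema):

{code}
import java.util.Arrays;
import java.util.List;

public class StringPoolDemo {
  // Hypothetical partition record: fields are integer indexes into a
  // shared string pool instead of full strings.
  static class PartitionRef {
    final int locationIdx;
    final int serdeLibIdx;
    PartitionRef(int locationIdx, int serdeLibIdx) {
      this.locationIdx = locationIdx;
      this.serdeLibIdx = serdeLibIdx;
    }
  }

  public static void main(String[] args) {
    // The pool is sent once per response; a value repeated across many
    // partitions (eg a serde class name) costs one int per repetition.
    List<String> stringPool = Arrays.asList(
        "hdfs://nn/warehouse/t/ds=2018-05-25",
        "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe");
    PartitionRef p = new PartitionRef(0, 1);
    System.out.println(stringPool.get(p.locationIdx));
    System.out.println(stringPool.get(p.serdeLibIdx));
  }
}
{code}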

> Consolidated and flexible API for fetching partition metadata from HMS
> --
>
> Key: HIVE-19715
> URL: https://issues.apache.org/jira/browse/HIVE-19715
> Project: Hive
>  Issue Type: New Feature
>  Components: Standalone Metastore
>Reporter: Todd Lipcon
>Priority: Major
>
> Currently, the HMS thrift API exposes 17 different APIs for fetching 
> partition-related information. There is somewhat of a combinatorial explosion 
> going on, where each API has variants with and without "auth" info, by pspecs 
> vs names, by filters, by exprs, etc. Having all of these separate APIs long 
> term is a maintenance burden and also more confusing for consumers.
> Additionally, even with all of these APIs, there is a lack of granularity in 
> fetching only the information needed for a particular use case. For example, 
> in some use cases it may be beneficial to only fetch the partition locations 
> without wasting effort fetching statistics, etc.
> This JIRA proposes that we add a new "one API to rule them all" for fetching 
> partition info. The request and response would be encapsulated in structs. 
> Some desirable properties:
> - the request should be able to specify which pieces of information are 
> required (eg location, properties, etc)
> - in the case of partition parameters, the request should be able to do 
> either whitelisting or blacklisting (eg to exclude large incremental column 
> stats HLL dumped in there by Impala)
> - the request should optionally specify auth info (to encompass the 
> "with_auth" variants)
> - the request should be able to designate the set of partitions to access 
> through one of several different methods (eg "all", list, expr, 
> part_vals, etc) 
> - the struct should be easily evolvable so that new pieces of info can be 
> added
> - the response should be designed in such a way as to avoid transferring 
> redundant information for common cases (eg simple "dictionary coding" of 
> strings like parameter names, etc)
> - the API should support some form of pagination for tables with large 
> partition counts



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19685) OpenTracing support for HMS

2018-05-25 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491000#comment-16491000
 ] 

Todd Lipcon commented on HIVE-19685:


Thanks for the review [~prasanth_j]. Is there something I need to do to trigger 
a precommit build here? (Note I'm not a committer so will need your help to 
commit).

> OpenTracing support for HMS
> ---
>
> Key: HIVE-19685
> URL: https://issues.apache.org/jira/browse/HIVE-19685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hive-19685.patch, trace.png
>
>
> When diagnosing performance of metastore operations it isn't always obvious 
> why something took a long time. Using a tracing framework can provide an 
> end-to-end view of an operation including time spent in dependent systems (eg 
> filesystem operations, RDBMS queries, etc). This JIRA proposes to integrate 
> OpenTracing, which is a vendor-neutral tracing API into the HMS server and 
> client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19685) OpenTracing support for HMS

2018-05-23 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HIVE-19685:
---
Assignee: Todd Lipcon
  Status: Patch Available  (was: Open)

Attached patch does the most basic integration here. It instruments the Thrift 
protocol so that it propagates traces from clients and creates spans around the 
thrift calls themselves.

Using opentracing's JDBC driver you can also get the query traces "for free".

Here's a set of steps you can use to try it out:

1. Build the metastore as normal
2. cd into the bin/ target directory

{code}
# Pull the Jaeger tracer and the OpenTracing JDBC wrapper into lib/
mvn dependency:get -Dartifact=io.jaegertracing:jaeger-tracerresolver:0.27.0 -Ddest=lib/
mvn dependency:get -Dartifact=io.jaegertracing:jaeger-core:0.27.0 -Ddest=lib/
mvn dependency:get -Dartifact=io.jaegertracing:jaeger-thrift:0.27.0 -Ddest=lib/
mvn dependency:get -Dartifact=io.opentracing.contrib:opentracing-jdbc:0.0.6 -Ddest=lib/

# Configure the Jaeger client (picked up via environment variables)
export JAEGER_SERVICE_NAME=hms
export JAEGER_AGENT_HOST=localhost
export JAEGER_AGENT_PORT=6831
export JAEGER_REPORTER_LOG_SPANS=1
export JAEGER_REPORTER_FLUSH_INTERVAL=1000
export JAEGER_SAMPLER_TYPE=const
export JAEGER_SAMPLER_PARAM=1

# Run the Jaeger all-in-one collector/UI
docker run -d \
  -e COLLECTOR_ZIPKIN_HTTP_PORT=9411 \
  -p 5775:5775/udp \
  -p 6831:6831/udp \
  -p 6832:6832/udp \
  -p 5778:5778 \
  -p 16686:16686 \
  -p 14268:14268 \
  -p 9411:9411 \
  jaegertracing/all-in-one:latest

# Initialize the schema and start the HMS with the tracing JDBC driver
./bin/schematool -initSchema -dbType derby
bin/start-metastore \
  -hiveconf 'javax.jdo.option.ConnectionURL=jdbc:tracing:derby:;databaseName=metastore_db;create=true' \
  -hiveconf javax.jdo.option.ConnectionDriverName=io.opentracing.contrib.jdbc.TracingDriver
{code}

If you navigate to http://localhost:16686 you should see the jaeger UI. You can 
then run some thrift calls against the HMS and you should see the resulting 
traces.

I wasn't sure if it was worth adding explicit new unit tests for this. If you 
think so, let me know. There is a MockTracer implementation (in the 
io.opentracing:opentracing-mock artifact) that we could use for testing.

> OpenTracing support for HMS
> ---
>
> Key: HIVE-19685
> URL: https://issues.apache.org/jira/browse/HIVE-19685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Major
> Attachments: hive-19685.patch, trace.png
>
>
> When diagnosing performance of metastore operations it isn't always obvious 
> why something took a long time. Using a tracing framework can provide an 
> end-to-end view of an operation including time spent in dependent systems (eg 
> filesystem operations, RDBMS queries, etc). This JIRA proposes to integrate 
> OpenTracing, which is a vendor-neutral tracing API into the HMS server and 
> client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19685) OpenTracing support for HMS

2018-05-23 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HIVE-19685:
---
Attachment: hive-19685.patch

> OpenTracing support for HMS
> ---
>
> Key: HIVE-19685
> URL: https://issues.apache.org/jira/browse/HIVE-19685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Todd Lipcon
>Priority: Major
> Attachments: hive-19685.patch, trace.png
>
>
> When diagnosing performance of metastore operations it isn't always obvious 
> why something took a long time. Using a tracing framework can provide an 
> end-to-end view of an operation including time spent in dependent systems (eg 
> filesystem operations, RDBMS queries, etc). This JIRA proposes to integrate 
> OpenTracing, which is a vendor-neutral tracing API into the HMS server and 
> client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19685) OpenTracing support for HMS

2018-05-23 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488142#comment-16488142
 ] 

Todd Lipcon commented on HIVE-19685:


[~prasanth_j] sorry, didn't see your question as it came while I was writing 
the above comment. I think I partially answered it, but I'll give slightly more 
color:

- the default configuration would be a "no-op" tracer, which should have no 
measurable overhead.
- if you drop one of the tracer implementations onto the classpath, it's up to 
you to configure it and provide your own trace collection infrastructure. In 
the case of Jaeger, for example, I've been using a configuration like:

{code}
export JAEGER_SERVICE_NAME=hms
export JAEGER_AGENT_HOST=my-machine.example.com
export JAEGER_AGENT_PORT=6831
export JAEGER_REPORTER_FLUSH_INTERVAL=1000
export JAEGER_SAMPLER_TYPE=const
export JAEGER_SAMPLER_PARAM=1
{code}

And on my-machine.example.com I run some docker images provided by the Jaeger 
community. The simplest docker image they provide uses an in-memory store, but 
it can also write to Cassandra or Elastic Search as backends. It also provides 
the UI as seen in my screenshot.

Personally I've found this very useful to understand HMS performance issues 
during development, but I'm not sure if many end-users who deploy Hive would 
bother to set it up. IMO that's OK -- we can treat it as a dev-only feature 
without adding much maintenance burden.

> OpenTracing support for HMS
> ---
>
> Key: HIVE-19685
> URL: https://issues.apache.org/jira/browse/HIVE-19685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Todd Lipcon
>Priority: Major
> Attachments: trace.png
>
>
> When diagnosing performance of metastore operations it isn't always obvious 
> why something took a long time. Using a tracing framework can provide an 
> end-to-end view of an operation including time spent in dependent systems (eg 
> filesystem operations, RDBMS queries, etc). This JIRA proposes to integrate 
> OpenTracing, which is a vendor-neutral tracing API into the HMS server and 
> client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19685) OpenTracing support for HMS

2018-05-23 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488139#comment-16488139
 ] 

Todd Lipcon commented on HIVE-19685:


Before working on a patch against trunk, I wanted to run a summary of the 
design by folks:

- add POM dependencies from the metastore module to opentracing
-- the "opentracing" project itself is just APIs, not coupled to any tracing 
implementation. You can think of it like Slf4j where the user has to provide an 
implementation of their choice. opentracing supports a number of 
implementations, the most popular being Jaeger (from Uber) and Zipkin (from 
Twitter) as well as a commercial implementation provided by LightStep.
-- we'd also include the 'tracerresolver' module. This uses a Java 
ServiceLoader to look for appropriate plugins on the classpath at start time. 
This would allow a user to drop Jaeger or Zipkin onto the classpath and enable 
tracing without recompilation. The tracing implementation's configuration is 
implementation-specific; Jaeger, for example, is configured via environment 
variables (see the sketch just after this list).
- add a POM dependency to opentracing-thrift, which is some simple utility code 
to wrap a TProtocol and TProcessor so that the client and server propagate a 
trace context between them. This allows a trace to be correlated between two 
processes (eg HS2 and HMS). We might want to shade these classes since they'd 
show up in the classpaths of consumers that use the HMS client.
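
A minimal sketch of the resolver wiring, assuming the standard tracerresolver 
and opentracing-util entry points:

{code}
import io.opentracing.Tracer;
import io.opentracing.contrib.tracerresolver.TracerResolver;
import io.opentracing.util.GlobalTracer;

final class TracingInit {
  // At server startup: pick up whatever tracer implementation the
  // ServiceLoader finds on the classpath (Jaeger, Zipkin, ...).
  static void maybeRegisterTracer() {
    Tracer tracer = TracerResolver.resolveTracer();
    if (tracer != null) {
      GlobalTracer.register(tracer);
    }
    // If nothing was registered, GlobalTracer.get() returns a no-op
    // tracer, so instrumented code paths cost essentially nothing.
  }
}
{code}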

In order to get the tracing of JDBC calls as shown in the screenshot above, no 
code is necessary. The user just adds the opentracing-jdbc jar to their 
classpath and then appropriately configures their JDBC connection string. It 
acts like a "passthrough" to the underlying JDBC driver.

The above is the basic integration. Beyond that, we can add small bits of 
instrumentation to interesting points of the code. For example:

{code}
private boolean ensureDbInit() {
  // Wrap the method body in a span so the call shows up in traces.
  try (Scope s = GlobalTracer.get().buildSpan("MetaStoreDirectSQL.ensureDbInit")
                     .startActive(true)) {
    // ... guts of method ...
  }
}
{code}

This makes it easy to spot issues like HIVE-19310.

Thoughts?

> OpenTracing support for HMS
> ---
>
> Key: HIVE-19685
> URL: https://issues.apache.org/jira/browse/HIVE-19685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Todd Lipcon
>Priority: Major
> Attachments: trace.png
>
>
> When diagnosing performance of metastore operations it isn't always obvious 
> why something took a long time. Using a tracing framework can provide an 
> end-to-end view of an operation including time spent in dependent systems (eg 
> filesystem operations, RDBMS queries, etc). This JIRA proposes to integrate 
> OpenTracing, which is a vendor-neutral tracing API into the HMS server and 
> client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19685) OpenTracing support for HMS

2018-05-23 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488129#comment-16488129
 ] 

Todd Lipcon commented on HIVE-19685:


Following is a screenshot of the trace view for the simple integration I've 
been playing with:
!trace.png! 

This makes it very easy to find issues like HIVE-19605 since the slow queries 
stick out like a sore thumb. It's also helpful for HMS clients that have 
integrated opentracing themselves -- it's easier to see how much of the total 
operation time of a slow request can be attributed to the HMS vs other 
contributing factors.

> OpenTracing support for HMS
> ---
>
> Key: HIVE-19685
> URL: https://issues.apache.org/jira/browse/HIVE-19685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Todd Lipcon
>Priority: Major
> Attachments: trace.png
>
>
> When diagnosing performance of metastore operations it isn't always obvious 
> why something took a long time. Using a tracing framework can provide an 
> end-to-end view of an operation including time spent in dependent systems (eg 
> filesystem operations, RDBMS queries, etc). This JIRA proposes to integrate 
> OpenTracing, which is a vendor-neutral tracing API into the HMS server and 
> client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (HIVE-19685) OpenTracing support for HMS

2018-05-23 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-19685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HIVE-19685:
---
Attachment: trace.png

> OpenTracing support for HMS
> ---
>
> Key: HIVE-19685
> URL: https://issues.apache.org/jira/browse/HIVE-19685
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Todd Lipcon
>Priority: Major
> Attachments: trace.png
>
>
> When diagnosing performance of metastore operations it isn't always obvious 
> why something took a long time. Using a tracing framework can provide an 
> end-to-end view of an operation including time spent in dependent systems (eg 
> filesystem operations, RDBMS queries, etc). This JIRA proposes to integrate 
> OpenTracing, which is a vendor-neutral tracing API into the HMS server and 
> client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name

2018-05-22 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484687#comment-16484687
 ] 

Todd Lipcon commented on HIVE-19605:


Seems the slow query on get_table is due to the "initialization" queries in 
ensureDbInit(). This is worked around by HIVE-19310.

However, the API calls that are actually meant to fetch column stats are still 
slow, and this should be fixed for their sake.

> TAB_COL_STATS table has no index on db/table name
> -
>
> Key: HIVE-19605
> URL: https://issues.apache.org/jira/browse/HIVE-19605
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Todd Lipcon
>Assignee: Vihang Karajgaonkar
>Priority: Major
>
> The TAB_COL_STATS table is missing an index on (CAT_NAME, DB_NAME, 
> TABLE_NAME). The getTableColumnStatistics call queries based on this tuple. 
> This makes those queries take a significant amount of time in large 
> metastores since they do a full table scan.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name

2018-05-22 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484648#comment-16484648
 ] 

Todd Lipcon commented on HIVE-19605:


Upon further inspection it seems the above query is likely generated during the 
initialization of ObjectStore, not directly within the get_table call. So, any 
call can end up making this query and generating a big outlier.

> TAB_COL_STATS table has no index on db/table name
> -
>
> Key: HIVE-19605
> URL: https://issues.apache.org/jira/browse/HIVE-19605
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Todd Lipcon
>Assignee: Vihang Karajgaonkar
>Priority: Major
>
> The TAB_COL_STATS table is missing an index on (CAT_NAME, DB_NAME, 
> TABLE_NAME). The getTableColumnStatistics call queries based on this tuple. 
> This makes those queries take a significant amount of time in large 
> metastores since they do a full table scan.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-19605) TAB_COL_STATS table has no index on db/table name

2018-05-22 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-19605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16484623#comment-16484623
 ] 

Todd Lipcon commented on HIVE-19605:


It seems like this table can also be queried from a get_table call. Oddly, the 
query being generated is:

SELECT 'org.apache.hadoop.hive.metastore.model.MTableColumnStatistics' AS 
NUCLEUS_TYPE,`A0`.`AVG_COL_LEN`,`A0`.`COLUMN_NAME`,`A0`.`COLUMN_TYPE`,`A0`.`DB_NAME`,`A0`.`BIG_DECIMAL_HIGH_VALUE`,`A0`.`BIG_DECIMAL_LOW_VALUE`,`A0`.`DOUBLE_HIGH_VALUE`,`A0`.`DOUBLE_LOW_VALUE`,`A0`.`LAST_ANALYZED`,`A0`.`LONG_HIGH_VALUE`,`A0`.`LONG_LOW_VALUE`,`A0`.`MAX_COL_LEN`,`A0`.`NUM_DISTINCTS`,`A0`.`NUM_FALSES`,`A0`.`NUM_NULLS`,`A0`.`NUM_TRUES`,`A0`.`TABLE_NAME`,`A0`.`CS_ID`
 FROM `TAB_COL_STATS` `A0` WHERE `A0`.`DB_NAME` = '';

(note the empty db_name).

Given the lack of an index, this takes 450ms on the HMS instance I am testing 
(if the MySQL query cache is disabled).

> TAB_COL_STATS table has no index on db/table name
> -
>
> Key: HIVE-19605
> URL: https://issues.apache.org/jira/browse/HIVE-19605
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Todd Lipcon
>Priority: Major
>
> The TAB_COL_STATS table is missing an index on (CAT_NAME, DB_NAME, 
> TABLE_NAME). The getTableColumnStatistics call queries based on this tuple. 
> This makes those queries take a significant amount of time in large 
> metastores since they do a full table scan.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HIVE-12971) Hive Support for Kudu

2016-02-01 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-12971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15126904#comment-15126904
 ] 

Todd Lipcon commented on HIVE-12971:


Great to see this JIRA. Let us know how the Kudu dev community can help you 
guys out with this.

> Hive Support for Kudu
> -
>
> Key: HIVE-12971
> URL: https://issues.apache.org/jira/browse/HIVE-12971
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 2.0.0
>Reporter: Lenni Kuff
>
> JIRA for tracking work related to Hive/Kudu integration.
> It would be useful to allow Kudu data to be accessible via Hive. This would 
> involve creating a Kudu SerDe/StorageHandler and implementing support for 
> QUERY and DML commands like SELECT, INSERT, UPDATE, and DELETE. Kudu 
> Input/OutputFormats classes already exist. The work can be staged to support 
> this functionality incrementally.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)