[jira] [Commented] (HIVE-12508) HiveAggregateJoinTransposeRule places a heavy load on the metadata system
[ https://issues.apache.org/jira/browse/HIVE-12508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025528#comment-15025528 ] Jesus Camacho Rodriguez commented on HIVE-12508: [~jpullokkaran], wouldn't it be safer to add this fix too till CALCITE-794 is fixed? Otherwise, I will close it as duplicate. > HiveAggregateJoinTransposeRule places a heavy load on the metadata system > - > > Key: HIVE-12508 > URL: https://issues.apache.org/jira/browse/HIVE-12508 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12508.patch > > > Finding out whether the input is already unique requires a call to > areColumnsUnique that currently (until CALCITE-794 is fixed) places a heavy > load on the metadata system. This can lead to long CBO planning. > This is a temporary fix that avoids the call to the method till then. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12508) HiveAggregateJoinTransposeRule places a heavy load on the metadata system
[ https://issues.apache.org/jira/browse/HIVE-12508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025581#comment-15025581 ] Laljo John Pullokkaran commented on HIVE-12508: --- Can we run into CALCITE-794 with the HIVE-12503 patch? My understanding is that CALCITE-794 is not an issue with the metadata system itself; rather, it's an issue when the rule fires repeatedly on the same node. > HiveAggregateJoinTransposeRule places a heavy load on the metadata system > - > > Key: HIVE-12508 > URL: https://issues.apache.org/jira/browse/HIVE-12508 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12508.patch > > > Finding out whether the input is already unique requires a call to > areColumnsUnique that currently (until CALCITE-794 is fixed) places a heavy > load on the metadata system. This can lead to long CBO planning. > This is a temporary fix that avoids the call to the method till then. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)
[ https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025654#comment-15025654 ] Sergey Shelukhin edited comment on HIVE-12341 at 11/24/15 11:11 PM: Sorry, please diff revisions 2 and 5 on RB, 3 contains generated code and in 4 I forgot a file 0_o was (Author: sershe): Sorry, please diff revisions 2 and -4- 5 on RB, 3 contains generated code and in 4 I forgot a file 0_o > LLAP: add security to daemon protocol endpoint (excluding shuffle) > -- > > Key: HIVE-12341 > URL: https://issues.apache.org/jira/browse/HIVE-12341 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, > HIVE-12341.03.patch, HIVE-12341.03.patch, HIVE-12341.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)
[ https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025654#comment-15025654 ] Sergey Shelukhin edited comment on HIVE-12341 at 11/24/15 11:11 PM: Sorry, please diff revisions 2 and -4- 5 on RB, 3 contains generated code and in 4 I forgot a file 0_o was (Author: sershe): Sorry, please diff revisions 2 and 4 on RB, 3 contains generated code > LLAP: add security to daemon protocol endpoint (excluding shuffle) > -- > > Key: HIVE-12341 > URL: https://issues.apache.org/jira/browse/HIVE-12341 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, > HIVE-12341.03.patch, HIVE-12341.03.patch, HIVE-12341.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12307) Streaming API TransactionBatch.close() must abort any remaining transactions in the batch
[ https://issues.apache.org/jira/browse/HIVE-12307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman updated HIVE-12307: -- Attachment: HIVE-12307.2.patch [~alangates], I uploaded a new patch with refactored write() and using your wrapper/delegator idea, which makes things look much cleaner. > Streaming API TransactionBatch.close() must abort any remaining transactions > in the batch > - > > Key: HIVE-12307 > URL: https://issues.apache.org/jira/browse/HIVE-12307 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 0.14.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-12307.2.patch, HIVE-12307.patch > > > When the client of the TransactionBatch API encounters an error it must close() > the batch and start a new one. This prevents attempts to continue writing to > a file that may be damaged in some way. > close() should ensure that it aborts any txns that still remain in the > batch and closes (best effort) all the files it's writing to. The batch > should also put itself into a mode where any future ops on this batch fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9642) Hive metastore client retries don't happen consistently for all api calls
[ https://issues.apache.org/jira/browse/HIVE-9642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025733#comment-15025733 ] Thejas M Nair commented on HIVE-9642: - +1 Thanks for also adding the test case. Note for others- this new patch also addresses the cases where the MetaStoreClient constructor has errors. > Hive metastore client retries don't happen consistently for all api calls > - > > Key: HIVE-9642 > URL: https://issues.apache.org/jira/browse/HIVE-9642 > Project: Hive > Issue Type: Bug >Affects Versions: 1.0.0 >Reporter: Xiaobing Zhou >Assignee: Daniel Dai > Attachments: HIVE-9642.1.patch, HIVE-9642.2.patch, HIVE-9642.3.patch > > > When org.apache.thrift.transport.TTransportException is thrown for issues > like socket timeout, the retry via RetryingMetaStoreClient happens only in > certain cases. > Retry happens for the getDatabase call but not for getAllDatabases(). > The reason is that RetryingMetaStoreClient checks for TTransportException being > the cause of the InvocationTargetException. But in the case of some calls such as > getAllDatabases in HiveMetastoreClient, all exceptions get wrapped in a > MetaException. We should remove this unnecessary wrapping of exceptions for > certain functions in HMC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
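The retry gap described above comes down to where the TTransportException sits in the cause chain when the retrying proxy inspects a failed call. A minimal, self-contained sketch of that decision follows; the exception classes here are stand-ins for the real Thrift/Hive ones, and this is an illustration of the idea, not the actual RetryingMetaStoreClient code:

```java
// Sketch of the retry decision discussed above. TransportException and
// MetaException are stand-ins for the Thrift/Hive classes of the same role.
public class RetrySketch {
  static class TransportException extends RuntimeException {}
  static class MetaException extends RuntimeException {
    MetaException(Throwable cause) { super(cause); }
  }

  // Walk the whole cause chain looking for the transport-level failure.
  // Checking only the immediate cause (the behavior the issue describes)
  // would miss calls like getAllDatabases() that wrap the transport error
  // in a MetaException.
  static boolean shouldRetry(Throwable t) {
    for (Throwable c = t; c != null; c = c.getCause()) {
      if (c instanceof TransportException) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Direct transport failure: retried.
    System.out.println(shouldRetry(new TransportException()));                       // true
    // Same failure wrapped in MetaException: found only because we walk the chain.
    System.out.println(shouldRetry(new MetaException(new TransportException())));    // true
  }
}
```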
[jira] [Commented] (HIVE-12020) Revert log4j2 xml configuration to properties based configuration
[ https://issues.apache.org/jira/browse/HIVE-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025596#comment-15025596 ] Sergey Shelukhin commented on HIVE-12020: - +1 pending tests... I didn't check, I assume none of the property files changed logically thru both transitions (to XML and back) > Revert log4j2 xml configuration to properties based configuration > - > > Key: HIVE-12020 > URL: https://issues.apache.org/jira/browse/HIVE-12020 > Project: Hive > Issue Type: Sub-task > Components: Logging >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-12020.1.patch, HIVE-12020.2.patch > > > Log4j 2.4 release brought back properties based configuration. We should > revert XML based configuration and use properties based configuration instead > (less verbose and will be similar to old log4j properties). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)
[ https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12341: Attachment: HIVE-12341.03.patch > LLAP: add security to daemon protocol endpoint (excluding shuffle) > -- > > Key: HIVE-12341 > URL: https://issues.apache.org/jira/browse/HIVE-12341 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, > HIVE-12341.03.patch, HIVE-12341.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12510) LLAP: Append attempt id either to thread name or NDC
[ https://issues.apache.org/jira/browse/HIVE-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025661#comment-15025661 ] Siddharth Seth commented on HIVE-12510: --- If the NDC is taking care of this - setting it in the thread name isn't required. > LLAP: Append attempt id either to thread name or NDC > > > Key: HIVE-12510 > URL: https://issues.apache.org/jira/browse/HIVE-12510 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > Currently, in LLAP attempt id gets appended to both thread name and added to > NDC creating long log lines like below > {code} > [TezTaskRunner_attempt_1448393634013_0008_1_03_00_0[attempt_1448393634013_0008_1_03_00_0]] > {code} > I think it will be sufficient to add only to NDC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)
[ https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12341: Attachment: HIVE-12341.03.patch > LLAP: add security to daemon protocol endpoint (excluding shuffle) > -- > > Key: HIVE-12341 > URL: https://issues.apache.org/jira/browse/HIVE-12341 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, > HIVE-12341.03.patch, HIVE-12341.03.patch, HIVE-12341.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12510) LLAP: Append attempt id either to thread name or NDC
[ https://issues.apache.org/jira/browse/HIVE-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025779#comment-15025779 ] Prasanth Jayachandran commented on HIVE-12510: -- Fixed in the .3 patch of HIVE-12020. The attempt id will now be set only in the NDC. > LLAP: Append attempt id either to thread name or NDC > > > Key: HIVE-12510 > URL: https://issues.apache.org/jira/browse/HIVE-12510 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > Currently, in LLAP attempt id gets appended to both thread name and added to > NDC creating long log lines like below > {code} > [TezTaskRunner_attempt_1448393634013_0008_1_03_00_0[attempt_1448393634013_0008_1_03_00_0]] > {code} > I think it will be sufficient to add only to NDC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12020) Revert log4j2 xml configuration to properties based configuration
[ https://issues.apache.org/jira/browse/HIVE-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12020: - Attachment: HIVE-12020.3.patch One more change related to LLAP. HIVE-12510 is addressed in this patch. Removed the attempt id from TezTaskRunner > Revert log4j2 xml configuration to properties based configuration > - > > Key: HIVE-12020 > URL: https://issues.apache.org/jira/browse/HIVE-12020 > Project: Hive > Issue Type: Sub-task > Components: Logging >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-12020.1.patch, HIVE-12020.2.patch, > HIVE-12020.3.patch > > > Log4j 2.4 release brought back properties based configuration. We should > revert XML based configuration and use properties based configuration instead > (less verbose and will be similar to old log4j properties). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12510) LLAP: Append attempt id either to thread name or NDC
[ https://issues.apache.org/jira/browse/HIVE-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12510: - Fix Version/s: 2.0.0 > LLAP: Append attempt id either to thread name or NDC > > > Key: HIVE-12510 > URL: https://issues.apache.org/jira/browse/HIVE-12510 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.0.0 > > > Currently, in LLAP attempt id gets appended to both thread name and added to > NDC creating long log lines like below > {code} > [TezTaskRunner_attempt_1448393634013_0008_1_03_00_0[attempt_1448393634013_0008_1_03_00_0]] > {code} > I think it will be sufficient to add only to NDC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12020) Revert log4j2 xml configuration to properties based configuration
[ https://issues.apache.org/jira/browse/HIVE-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025612#comment-15025612 ] Prasanth Jayachandran commented on HIVE-12020: -- Let me list the changes: 1) DataNucleus-related logging has changed. Earlier, many specific DataNucleus loggers were explicitly added. In this patch, all top-level DataNucleus loggers are added (no need for specific loggers). I discussed it with Sushanth and he said the new change is good. If we want specific logger changes we can do it later. 2) In the log4j2.xml files, I had mistakenly added %x (used by the NDC) to the pattern layout. I don't think anything other than llap uses the NDC, so %x is added only to the llap properties. 3) The Log4j version is updated to 2.4.1 to work around an NPE with empty loggers. 4) HiveEventCounter has been removed from the root logger configuration. It was added by default and I don't think it is of much significance. It publishes counts of messages logged at different log levels to Hadoop metrics, but I don't see any configuration for hadoop-metrics in the Hive source. If required, this can also be added back. Other than these changes, it's pretty much a one-to-one copy from log4j2.xml. > Revert log4j2 xml configuration to properties based configuration > - > > Key: HIVE-12020 > URL: https://issues.apache.org/jira/browse/HIVE-12020 > Project: Hive > Issue Type: Sub-task > Components: Logging >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-12020.1.patch, HIVE-12020.2.patch > > > Log4j 2.4 release brought back properties based configuration. We should > revert XML based configuration and use properties based configuration instead > (less verbose and will be similar to old log4j properties). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
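Change 2) above can be illustrated with a hypothetical log4j2 properties-format fragment; the appender names and pattern below are illustrative, not the actual contents of the patch:

```properties
# Illustrative llap-daemon appender: %x emits the NDC (which carries the task
# attempt id), so only the llap pattern layout includes it.
appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{ISO8601} %5p [%t (%x)] %c{2}: %m%n

# Layouts for other components would omit %x, since nothing but llap uses the NDC.
```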
[jira] [Commented] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)
[ https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025654#comment-15025654 ] Sergey Shelukhin commented on HIVE-12341: - Sorry, please diff revisions 2 and 4 on RB, 3 contains generated code > LLAP: add security to daemon protocol endpoint (excluding shuffle) > -- > > Key: HIVE-12341 > URL: https://issues.apache.org/jira/browse/HIVE-12341 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, > HIVE-12341.03.patch, HIVE-12341.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12513) Change LlapTokenIdentifier to use protbuf
[ https://issues.apache.org/jira/browse/HIVE-12513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025716#comment-15025716 ] Sergey Shelukhin commented on HIVE-12513: - Token is part of Hadoop security and is Writable. What we have is a token identifier; that also has to be Writable so that Hadoop security can write it, as all the basic parts of TokenIdentifier are inherited from the delegation token identifier. We can use protobuf for our stuff and just write the bytes into the Writable (currently we add nothing to the basic token though), but the basic token and the superclass identifier have to stay Writable. > Change LlapTokenIdentifier to use protbuf > - > > Key: HIVE-12513 > URL: https://issues.apache.org/jira/browse/HIVE-12513 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Siddharth Seth > > Follow up to HIVE-12341. Currently writable, which can get in the way of > upgrades. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12513) Change LlapTokenIdentifier to use protobuf
[ https://issues.apache.org/jira/browse/HIVE-12513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-12513: -- Summary: Change LlapTokenIdentifier to use protobuf (was: Change LlapTokenIdentifier to use protbuf) > Change LlapTokenIdentifier to use protobuf > -- > > Key: HIVE-12513 > URL: https://issues.apache.org/jira/browse/HIVE-12513 > Project: Hive > Issue Type: Improvement > Components: llap >Reporter: Siddharth Seth > > Follow up to HIVE-12341. Currently writable, which can get in the way of > upgrades. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12513) Change LlapTokenIdentifier to use protobuf
[ https://issues.apache.org/jira/browse/HIVE-12513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025736#comment-15025736 ] Siddharth Seth commented on HIVE-12513: --- Writable just requires bytes to be written and read back in. A protobuf instance wrapped in a Writable can be used (this is used in Hadoop to allow for changes to the token across versions). Essentially, the serialized bytes end up being interpreted as protobuf, with support for unknown fields etc. > Change LlapTokenIdentifier to use protobuf > -- > > Key: HIVE-12513 > URL: https://issues.apache.org/jira/browse/HIVE-12513 > Project: Hive > Issue Type: Improvement > Components: llap, Security >Reporter: Siddharth Seth > > Follow up to HIVE-12341. Currently writable, which can get in the way of > upgrades. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
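The protobuf-in-Writable pattern described in this thread can be sketched without the Hadoop or protobuf dependencies: the Writable contract only needs write/readFields over DataOutput/DataInput, and the payload stays opaque bytes. With real protobuf the payload would come from message.toByteArray() on write and go through parseFrom() on read, gaining unknown-field tolerance across versions. The class and method names below are illustrative, not Hive's actual LlapTokenIdentifier:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;

// Keeps the Writable-style contract while the payload is opaque bytes
// (e.g. a serialized protobuf message). Illustrative names only.
public class ProtoWrappedIdentifier {
  private byte[] payload = new byte[0];

  public ProtoWrappedIdentifier() {}
  public ProtoWrappedIdentifier(byte[] payload) { this.payload = payload; }

  public void write(DataOutput out) throws IOException {
    out.writeInt(payload.length);   // length-prefix the opaque bytes
    out.write(payload);
  }

  public void readFields(DataInput in) throws IOException {
    payload = new byte[in.readInt()];
    in.readFully(payload);
  }

  public byte[] getPayload() { return payload; }

  // Serialize then deserialize, returning the recovered payload.
  public static byte[] roundTrip(byte[] msg) {
    try {
      ByteArrayOutputStream bos = new ByteArrayOutputStream();
      new ProtoWrappedIdentifier(msg).write(new DataOutputStream(bos));
      ProtoWrappedIdentifier read = new ProtoWrappedIdentifier();
      read.readFields(new DataInputStream(new ByteArrayInputStream(bos.toByteArray())));
      return read.getPayload();
    } catch (IOException e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(Arrays.equals(roundTrip(new byte[]{1, 2, 3}), new byte[]{1, 2, 3}));  // true
  }
}
```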
[jira] [Updated] (HIVE-12512) Include driver logs in execution-level Operation logs
[ https://issues.apache.org/jira/browse/HIVE-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohit Sabharwal updated HIVE-12512: --- Attachment: HIVE-12512.patch > Include driver logs in execution-level Operation logs > - > > Key: HIVE-12512 > URL: https://issues.apache.org/jira/browse/HIVE-12512 > Project: Hive > Issue Type: Bug > Components: Logging >Reporter: Mohit Sabharwal >Assignee: Mohit Sabharwal >Priority: Minor > Attachments: HIVE-12512.patch > > > When {{hive.server2.logging.operation.level}} is set to {{EXECUTION}} > (default), operation logs do not include Driver logs, which contain useful > info like total number of jobs launched, stage getting executed, etc. that > help track high-level progress. It only adds a few more lines to the output. > {code} > 15/11/24 14:09:12 INFO ql.Driver: Semantic Analysis Completed > 15/11/24 14:09:12 INFO ql.Driver: Starting > command(queryId=hive_20151124140909_e8cbb9bd-bac0-40b2-83d0-382de25b80d1): > select count(*) from sample_08 > 15/11/24 14:09:12 INFO ql.Driver: Query ID = > hive_20151124140909_e8cbb9bd-bac0-40b2-83d0-382de25b80d1 > 15/11/24 14:09:12 INFO ql.Driver: Total jobs = 1 > ... > 15/11/24 14:09:40 INFO ql.Driver: MapReduce Jobs Launched: > 15/11/24 14:09:40 INFO ql.Driver: Stage-Stage-1: Map: 1 Reduce: 1 > Cumulative CPU: 3.58 sec HDFS Read: 52956 HDFS Write: 4 SUCCESS > 15/11/24 14:09:40 INFO ql.Driver: Total MapReduce CPU Time Spent: 3 seconds > 580 msec > 15/11/24 14:09:40 INFO ql.Driver: OK > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive
[ https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Dere updated HIVE-11878: -- Attachment: HIVE-11878.2.patch Latest patch (HIVE-11878_approach3_with_review_comments1.patch) was not in the correct name format to kick off the pre-commit tests. Re-uploading it as HIVE-11878.2.patch. > ClassNotFoundException can possibly occur if multiple jars are registered > one at a time in Hive > > > Key: HIVE-11878 > URL: https://issues.apache.org/jira/browse/HIVE-11878 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Ratandeep Ratti >Assignee: Ratandeep Ratti > Labels: URLClassLoader > Attachments: HIVE-11878 ClassLoader Issues when Registering > Jars.pptx, HIVE-11878.2.patch, HIVE-11878.patch, HIVE-11878_approach3.patch, > HIVE-11878_approach3_per_session_clasloader.patch, > HIVE-11878_approach3_with_review_comments.patch, > HIVE-11878_approach3_with_review_comments1.patch, HIVE-11878_qtest.patch > > > When we register a jar on the Hive console, Hive creates a fresh URL > classloader which includes the path of the current jar to be registered and > all the jar paths of the parent classloader. The parent classloader is the > current ThreadContextClassLoader. Once the URLClassloader is created Hive > sets that as the current ThreadContextClassloader. > So if we register multiple jars in Hive, there will be multiple > URLClassLoaders created, each classloader including the jars from its parent > and the one extra jar to be registered. The last URLClassLoader created will > end up as the current ThreadContextClassLoader. (See details: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath) > Now here's an example in which the above strategy can lead to a CNF exception. > We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class > *c1* and internally relies on class *c2* in jar *j2*. 
We register *j1* first, > the URLClassLoader *u1* is created and also set as the > ThreadContextClassLoader. We register *j2* next, the new URLClassLoader > created will be *u2* with *u1* as parent and *u2* becomes the new > ThreadContextClassLoader. Note *u2* includes paths to both jars *j1* and *j2* > whereas *u1* only has paths to *j1* (For details see: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath). > Now when we register class *c1* under a temporary function in Hive, we load > the class using {code} class.forName("c1", true, > Thread.currentThread().getContextClassLoader()) {code} . The > currentThreadContext class-loader is *u2*, and it has the path to the class > *c1*, but note that Class-loaders work by delegating to parent class-loader > first. In this case class *c1* will be found and *defined* by class-loader > *u1*. > Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say > initialize) is called in *c1*, which references the class *c2*, *c2* will not > be found since the class-loader used to search for *c2* will be *u1* (Since > the caller's class-loader is used to load a class) > I've added a qtest to explain the problem. Please see the attached patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
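The parent-first delegation at the heart of this bug can be demonstrated in a few lines: even when a child URLClassLoader is the one asked, a class reachable through its parent is defined by the parent, so later references from that class resolve against the parent's narrower classpath (u1 in the scenario above). This is a generic, self-contained illustration, not the attached Hive qtest:

```java
import java.net.URL;
import java.net.URLClassLoader;

// Demonstrates parent-first delegation: asking the child loader for a class
// visible to the parent yields a class *defined* by the parent, mirroring how
// c1 ends up defined by u1 even though u2 is the ThreadContextClassLoader.
public class DelegationDemo {
  public static boolean definedByParent() {
    ClassLoader parent = DelegationDemo.class.getClassLoader();
    // Child with no URLs of its own, standing in for u2 delegating to u1.
    try (URLClassLoader child = new URLClassLoader(new URL[0], parent)) {
      Class<?> c = Class.forName("DelegationDemo", true, child);
      // Delegation won: the defining loader is the parent, not the child.
      return c.getClassLoader() == parent;
    } catch (Exception e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(definedByParent());  // true
  }
}
```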
[jira] [Updated] (HIVE-12514) Setup renewal of LLAP tokens
[ https://issues.apache.org/jira/browse/HIVE-12514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siddharth Seth updated HIVE-12514: -- Assignee: (was: Thejas M Nair) > Setup renewal of LLAP tokens > > > Key: HIVE-12514 > URL: https://issues.apache.org/jira/browse/HIVE-12514 > Project: Hive > Issue Type: Improvement > Components: llap, Security >Affects Versions: 2.0.0 >Reporter: Siddharth Seth > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11890) Create ORC module
[ https://issues.apache.org/jira/browse/HIVE-11890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-11890: - Attachment: HIVE-11890.patch This patch is rebased to master. It does: * creates an orc submodule * moves a couple more classes to the storage-api module * moves most of the api and utility classes to the orc module > Create ORC module > - > > Key: HIVE-11890 > URL: https://issues.apache.org/jira/browse/HIVE-11890 > Project: Hive > Issue Type: Sub-task >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, > HIVE-11890.patch > > > Start moving classes over to the ORC module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12020) Revert log4j2 xml configuration to properties based configuration
[ https://issues.apache.org/jira/browse/HIVE-12020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12020: - Attachment: HIVE-12020.2.patch This patch includes changes for llap. Also updated log4j2 version from 2.4 to 2.4.1 as we hit this issue https://issues.apache.org/jira/browse/LOG4J2-1153 > Revert log4j2 xml configuration to properties based configuration > - > > Key: HIVE-12020 > URL: https://issues.apache.org/jira/browse/HIVE-12020 > Project: Hive > Issue Type: Sub-task > Components: Logging >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-12020.1.patch, HIVE-12020.2.patch > > > Log4j 2.4 release brought back properties based configuration. We should > revert XML based configuration and use properties based configuration instead > (less verbose and will be similar to old log4j properties). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)
[ https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12341: Attachment: HIVE-12341.02.patch Addressed the RB comments. [~sseth] can you take a look at the Tez credentials transfer in particular, does that change make sense? I will set up a test for it later if all looks good. > LLAP: add security to daemon protocol endpoint (excluding shuffle) > -- > > Key: HIVE-12341 > URL: https://issues.apache.org/jira/browse/HIVE-12341 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, > HIVE-12341.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12184) DESCRIBE of fully qualified table fails when db and table name match and non-default database is in use
[ https://issues.apache.org/jira/browse/HIVE-12184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025745#comment-15025745 ] Xuefu Zhang commented on HIVE-12184: Patch looks good. A few minor comments on RB. > DESCRIBE of fully qualified table fails when db and table name match and > non-default database is in use > --- > > Key: HIVE-12184 > URL: https://issues.apache.org/jira/browse/HIVE-12184 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 1.2.1 >Reporter: Lenni Kuff >Assignee: Naveen Gangam > Attachments: HIVE-12184.2.patch, HIVE-12184.3.patch, > HIVE-12184.4.patch, HIVE-12184.5.patch, HIVE-12184.6.patch, > HIVE-12184.7.patch, HIVE-12184.8.patch, HIVE-12184.patch > > > DESCRIBE of fully qualified table fails when db and table name match and > non-default database is in use. > Repro: > {code} > : jdbc:hive2://localhost:1/default> create database foo; > No rows affected (0.116 seconds) > 0: jdbc:hive2://localhost:1/default> create table foo.foo(i int); > 0: jdbc:hive2://localhost:1/default> describe foo.foo; > +---++--+--+ > | col_name | data_type | comment | > +---++--+--+ > | i | int| | > +---++--+--+ > 1 row selected (0.049 seconds) > 0: jdbc:hive2://localhost:1/default> use foo; > 0: jdbc:hive2://localhost:1/default> describe foo.foo; > Error: Error while processing statement: FAILED: Execution Error, return code > 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from > serde.Invalid Field foo (state=08S01,code=1) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12487) Fix broken MiniLlap tests
[ https://issues.apache.org/jira/browse/HIVE-12487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025831#comment-15025831 ] Hive QA commented on HIVE-12487: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12773865/HIVE-12487.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 9864 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMarkPartition - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_smb_mapjoin_12 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_temp_table_gb1 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union31 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union32 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_15 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_short_regress org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.spark.client.rpc.TestRpc.testClientTimeout {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6118/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6118/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6118/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 11 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12773865 - PreCommit-HIVE-TRUNK-Build > Fix broken MiniLlap tests > - > > Key: HIVE-12487 > URL: https://issues.apache.org/jira/browse/HIVE-12487 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Aleksei Statkevich >Assignee: Aleksei Statkevich >Priority: Critical > Attachments: HIVE-12487.1.patch, HIVE-12487.2.patch, HIVE-12487.patch > > > Currently MiniLlap tests fail with the following error: > {code} > TestMiniLlapCliDriver - did not produce a TEST-*.xml file > {code} > Supposedly, it started happening after HIVE-12319. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12483) Fix precommit Spark test branch
[ https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025848#comment-15025848 ] Hive QA commented on HIVE-12483: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12773549/HIVE-12483.1-spark.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 9788 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_annotate_stats_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1013/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1013/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-1013/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12773549 - PreCommit-HIVE-SPARK-Build > Fix precommit Spark test branch > --- > > Key: HIVE-12483 > URL: https://issues.apache.org/jira/browse/HIVE-12483 > Project: Hive > Issue Type: Task >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-12483.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11488) Add sessionId and queryId info to HS2 log
[ https://issues.apache.org/jira/browse/HIVE-11488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025855#comment-15025855 ] Lefty Leverenz commented on HIVE-11488: --- Edit permission is easy to get, you just need a Confluence username: * [About This Wiki -- How to get permission to edit | https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki#AboutThisWiki-Howtogetpermissiontoedit] > Add sessionId and queryId info to HS2 log > - > > Key: HIVE-11488 > URL: https://issues.apache.org/jira/browse/HIVE-11488 > Project: Hive > Issue Type: New Feature > Components: Logging >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-11488.2.patch, HIVE-11488.3.patch, HIVE-11488.patch > > > Session is critical for a multi-user system like Hive. Currently Hive doesn't > log the sessionId to the log file, which sometimes makes debugging and analysis > difficult when multiple activities are going on at the same time and the logs > from different sessions are mixed together. > Currently, Hive already has the sessionId saved in SessionState and also > there is another sessionId in SessionHandle (seems not used and I'm still > looking to understand it). Generally we should have one sessionId from the > beginning on both the client side and the server side. It seems we have some work to do on that > side first. > The sessionId can then be added to the log4j-supported mapped diagnostic context > (MDC) and can be configured to output to the log file through the log4j properties. > MDC is per thread, so we need to add the sessionId to the HS2 main thread and > then it will be inherited by the child threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
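The inheritance behavior described above (set the sessionId once on the HS2 main thread, child threads pick it up) can be sketched with plain JDK thread-locals. This is an illustrative stand-in, not the actual Log4j MDC API the patch uses; Log4j's MDC behaves like an inheritable per-thread map.

```java
// Sketch only: an InheritableThreadLocal models how a sessionId set on the
// HS2 main thread becomes visible in the child threads it spawns.
public class SessionIdInheritance {
    static final InheritableThreadLocal<String> SESSION_ID = new InheritableThreadLocal<>();

    // Spawn a child thread and report the sessionId it observes.
    static String readFromChildThread() {
        final String[] seen = new String[1];
        Thread child = new Thread(() -> seen[0] = SESSION_ID.get());
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }

    public static void main(String[] args) {
        SESSION_ID.set("session-1234");            // set once on the "main" (HS2) thread
        System.out.println(readFromChildThread()); // prints session-1234
    }
}
```

Note the caveat this implies: threads created before the value is set, or taken from a pre-existing pool, do not inherit it, which is why the value must be installed on the HS2 main thread early.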
[jira] [Commented] (HIVE-12175) Upgrade Kryo version to 3.0.x
[ https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025973#comment-15025973 ] Lefty Leverenz commented on HIVE-12175: --- Sounds good to me. Adding a TODOC2.0 label. > Upgrade Kryo version to 3.0.x > - > > Key: HIVE-12175 > URL: https://issues.apache.org/jira/browse/HIVE-12175 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.0.0 > > Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, > HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, > HIVE-12175.5.patch, HIVE-12175.6.patch > > > Current version of kryo (2.22) has some issue (refer exception below and in > HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). We > need to either replace all occurrences of Arrays.asList() or change the > current StdInstantiatorStrategy. This issue is fixed in later versions and > kryo community recommends using DefaultInstantiatorStrategy with fallback to > StdInstantiatorStrategy. More discussion about this issue is here > https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, custom > serilization/deserilization class can be provided for Arrays.asList. > Also, kryo 3.0 introduced unsafe based serialization which claims to have > much better performance for certain types of serialization. 
> Exception: > {code} > Caused by: java.lang.NullPointerException > at java.util.Arrays$ArrayList.size(Arrays.java:2847) > at java.util.AbstractList.add(AbstractList.java:108) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18) > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) > ... 57 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
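The failure mode quoted above can be reproduced without Kryo at all: {{Arrays.asList()}} returns the fixed-size view {{java.util.Arrays$ArrayList}}, and Kryo's CollectionSerializer rebuilds collections by instantiating one and calling {{add()}}. A stdlib-only sketch of the underlying pitfall (a simpler symptom than the NPE in the trace, which comes from the instantiator creating the view without its backing array):

```java
import java.util.Arrays;
import java.util.List;

public class AsListPitfall {
    // Arrays$ArrayList is a fixed-size view over the backing array,
    // so the add() that CollectionSerializer relies on always throws.
    static boolean addIsRejected() {
        List<String> view = Arrays.asList("a", "b");
        try {
            view.add("c");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(addIsRejected()); // prints true
    }
}
```

This is why the recommendation above is either to avoid {{Arrays.asList()}} in serialized object graphs or to configure the instantiator strategy so such classes are constructed properly.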
[jira] [Updated] (HIVE-12175) Upgrade Kryo version to 3.0.x
[ https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-12175: -- Labels: TODOC2.0 (was: ) > Upgrade Kryo version to 3.0.x > - > > Key: HIVE-12175 > URL: https://issues.apache.org/jira/browse/HIVE-12175 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, > HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, > HIVE-12175.5.patch, HIVE-12175.6.patch > > > Current version of kryo (2.22) has some issue (refer exception below and in > HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). We > need to either replace all occurrences of Arrays.asList() or change the > current StdInstantiatorStrategy. This issue is fixed in later versions and > kryo community recommends using DefaultInstantiatorStrategy with fallback to > StdInstantiatorStrategy. More discussion about this issue is here > https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, custom > serilization/deserilization class can be provided for Arrays.asList. > Also, kryo 3.0 introduced unsafe based serialization which claims to have > much better performance for certain types of serialization. 
[jira] [Updated] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
[ https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Zheng updated HIVE-11531: - Attachment: HIVE-11531.02.patch Thanks [~sershe] and [~jcamachorodriguez] I updated the patch. > Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise > - > > Key: HIVE-11531 > URL: https://issues.apache.org/jira/browse/HIVE-11531 > Project: Hive > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Hui Zheng > Attachments: HIVE-11531.02.patch, HIVE-11531.WIP.1.patch, > HIVE-11531.WIP.2.patch, HIVE-11531.patch > > > For any UIs that involve pagination, it is useful to issue queries in the > form SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be > paginated (which can be extremely large by itself). At present, ROW_NUMBER > can be used to achieve this effect, but optimizations for LIMIT such as TopN > in ReduceSink do not apply to ROW_NUMBER. We can add first class support for > "skip" to existing limit, or improve ROW_NUMBER for better performance -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12175) Upgrade Kryo version to 3.0.x
[ https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025936#comment-15025936 ] Lefty Leverenz commented on HIVE-12175: --- Should [~prasanth_j]'s explanation be documented in the wiki? Or does this need any other documentation? > Upgrade Kryo version to 3.0.x > - > > Key: HIVE-12175 > URL: https://issues.apache.org/jira/browse/HIVE-12175 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.0.0 > > Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, > HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, > HIVE-12175.5.patch, HIVE-12175.6.patch > > > Current version of kryo (2.22) has some issue (refer exception below and in > HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). We > need to either replace all occurrences of Arrays.asList() or change the > current StdInstantiatorStrategy. This issue is fixed in later versions and > kryo community recommends using DefaultInstantiatorStrategy with fallback to > StdInstantiatorStrategy. More discussion about this issue is here > https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, custom > serilization/deserilization class can be provided for Arrays.asList. > Also, kryo 3.0 introduced unsafe based serialization which claims to have > much better performance for certain types of serialization. 
[jira] [Commented] (HIVE-1073) CREATE VIEW followup: track view dependency information in metastore
[ https://issues.apache.org/jira/browse/HIVE-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025987#comment-15025987 ] Carl Steinbach commented on HIVE-1073: -- Hi [~freepeter], thanks for writing up these notes! bq. To track the view dependency, I will add a new class MTableDependency (name TBD) which contains srcTbl and dstTbl. Since only views can have dependencies on other tables/views it probably makes sense to change the name to MViewDependency, and replace srcTbl with srcView. > CREATE VIEW followup: track view dependency information in metastore > - > > Key: HIVE-1073 > URL: https://issues.apache.org/jira/browse/HIVE-1073 > Project: Hive > Issue Type: Improvement > Components: Metastore, Views >Affects Versions: 0.6.0 >Reporter: John Sichi >Assignee: Wenlei Xie > > Add a generic mechanism for recording the fact that one object depends on > another. First use case (to be implemented as part of this task) would be > views depending on tables or other views, but in the future we can also use > this for views depending on persistent functions, functions depending on > other functions, etc. > This involves metastore modeling, QL analysis for deriving and recording the > dependencies (Ashish says something may already be available from the lineage > work), and CLI support for browsing dependencies. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
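The rename suggested above (MViewDependency with srcView instead of srcTbl) can be sketched as a plain value class. This is purely hypothetical shape for illustration; the real metastore model class would carry persistence annotations and richer typing.

```java
public class MViewDependency {
    // Hypothetical metastore model following the naming suggested above:
    // only views can depend on other objects, hence srcView rather than srcTbl.
    private final String srcView; // the view that holds the dependency
    private final String dstTbl;  // the table or view it depends on

    public MViewDependency(String srcView, String dstTbl) {
        this.srcView = srcView;
        this.dstTbl = dstTbl;
    }

    public String getSrcView() { return srcView; }
    public String getDstTbl()  { return dstTbl; }

    public static void main(String[] args) {
        MViewDependency d = new MViewDependency("sales_view", "sales_fact");
        System.out.println(d.getSrcView() + " -> " + d.getDstTbl());
    }
}
```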
[jira] [Commented] (HIVE-12466) SparkCounter not initialized error
[ https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025999#comment-15025999 ] Chengxiang Li commented on HIVE-12466: -- SparkCounters is only used for stats collection now, so yes, i think we may not need SparkCounters anymore if counter-based stats collection is removed. As far as i know, there is no other Hive features which depends on SparkCounters. > SparkCounter not initialized error > -- > > Key: HIVE-12466 > URL: https://issues.apache.org/jira/browse/HIVE-12466 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-12466.1-spark.patch > > > During a query, lots of the following error found in executor's log: > {noformat} > 03:47:28.759 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] > has not initialized before. > 03:47:28.762 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] > has not initialized before. > 03:47:30.707 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.tmp_tmp] has not initialized before. > 03:47:33.385 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:33.388 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:33.495 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:35.141 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > ... 
> {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
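The errors above are the classic register-before-use pattern: counters must be declared when the counters object is built (on the driver, before tasks run), and any increment against an unregistered name is dropped with an error. A hypothetical stand-in (not the actual SparkCounters API) that reproduces the logged behavior:

```java
import java.util.HashMap;
import java.util.Map;

public class Counters {
    private final Map<String, Long> values = new HashMap<>();

    // Counters must be registered up front, before any task increments them.
    void register(String name) { values.put(name, 0L); }

    // Increment is dropped (and logged) for names never registered.
    boolean increment(String name, long delta) {
        Long cur = values.get(name);
        if (cur == null) {
            System.err.println("counter[" + name + "] has not initialized before.");
            return false;
        }
        values.put(name, cur + delta);
        return true;
    }

    long get(String name) { return values.getOrDefault(name, 0L); }

    public static void main(String[] args) {
        Counters c = new Counters();
        c.register("RECORDS_OUT_0");
        c.increment("RECORDS_OUT_0", 1);                  // ok
        c.increment("RECORDS_OUT_1_default.tmp_tmp", 1);  // dropped: never registered
        System.out.println(c.get("RECORDS_OUT_0"));       // prints 1
    }
}
```

This matches the log lines in the report: the per-table RECORDS_OUT counters were never registered before executors started incrementing them.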
[jira] [Updated] (HIVE-12411) Remove counter based stats collection mechanism
[ https://issues.apache.org/jira/browse/HIVE-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-12411: -- Labels: TODOC2.0 (was: ) > Remove counter based stats collection mechanism > --- > > Key: HIVE-12411 > URL: https://issues.apache.org/jira/browse/HIVE-12411 > Project: Hive > Issue Type: Task > Components: Statistics >Affects Versions: 1.2.0, 1.2.1 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-12411.01.patch, HIVE-12411.02.patch > > > Following HIVE-12005, HIVE-12164, we have removed jdbc and hbase stats > collection mechanism. Now we are targeting counter based stats collection > mechanism. The main advantages are as follows (1) counter based stats has > limitation on the length of the counter itself, if it is too long, MD5 will > be applied. (2) when there are a large number of partitions and columns, we > need to create a large number of counters in memory. This will put a heavy > load on the M/R AM or Tez AM etc. FS based stats will do a better job. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12411) Remove counter based stats collection mechanism
[ https://issues.apache.org/jira/browse/HIVE-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025967#comment-15025967 ] Lefty Leverenz commented on HIVE-12411: --- Doc note: This changes *hive.stats.dbclass* (removing counter as a value) and removes *hive.stats.key.prefix.reserve.length* so the wiki needs to be updated for release 2.0.0. * [Configuration Properties -- hive.stats.dbclass | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.dbclass] * [Configuration Properties -- hive.stats.key.prefix.reserve.length | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.key.prefix.reserve.length] The Statistics doc does not mention counter-based stats so no update is required, although an explanation of collection mechanisms would be a helpful addition. *hive.stats.dbclass* is discussed in the Usage section. * [Statistics in Hive | https://cwiki.apache.org/confluence/display/Hive/StatsDev] ** [Implementation | https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-Implementation] ** [Usage | https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-Usage] > Remove counter based stats collection mechanism > --- > > Key: HIVE-12411 > URL: https://issues.apache.org/jira/browse/HIVE-12411 > Project: Hive > Issue Type: Task > Components: Statistics >Affects Versions: 1.2.0, 1.2.1 >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-12411.01.patch, HIVE-12411.02.patch > > > Following HIVE-12005, HIVE-12164, we have removed jdbc and hbase stats > collection mechanism. Now we are targeting counter based stats collection > mechanism. The main advantages are as follows (1) counter based stats has > limitation on the length of the counter itself, if it is too long, MD5 will > be applied. 
(2) when there are a large number of partitions and columns, we > need to create a large number of counters in memory. This will put a heavy > load on the M/R AM or Tez AM etc. FS based stats will do a better job. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
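With the counter mechanism removed (and jdbc/hbase gone after HIVE-12005 and HIVE-12164), filesystem-based collection is the remaining path. A hedged sketch of the corresponding setting after this change ({{fs}} was already a supported value of *hive.stats.dbclass*; {{counter}} stops being one):

```sql
-- Assumed post-HIVE-12411 usage: 'counter' is no longer a valid value.
set hive.stats.dbclass=fs;
```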
[jira] [Commented] (HIVE-12175) Upgrade Kryo version to 3.0.x
[ https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025953#comment-15025953 ] Prasanth Jayachandran commented on HIVE-12175: -- Maybe we should list all the custom serializers that Hive uses in the documentation, and add a note telling users that if any other serializer is required at runtime, a runtime exception may be thrown when object creation fails. > Upgrade Kryo version to 3.0.x > - > > Key: HIVE-12175 > URL: https://issues.apache.org/jira/browse/HIVE-12175 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.0.0 > > Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, > HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, > HIVE-12175.5.patch, HIVE-12175.6.patch > > > Current version of kryo (2.22) has some issue (refer exception below and in > HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). We > need to either replace all occurrences of Arrays.asList() or change the > current StdInstantiatorStrategy. This issue is fixed in later versions and > kryo community recommends using DefaultInstantiatorStrategy with fallback to > StdInstantiatorStrategy. More discussion about this issue is here > https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, a custom > serialization/deserialization class can be provided for Arrays.asList. > Also, kryo 3.0 introduced unsafe based serialization which claims to have > much better performance for certain types of serialization. 
[jira] [Updated] (HIVE-4240) optimize hive.enforce.bucketing and hive.enforce sorting insert
[ https://issues.apache.org/jira/browse/HIVE-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-4240: - Labels: TODOC11 (was: ) > optimize hive.enforce.bucketing and hive.enforce sorting insert > --- > > Key: HIVE-4240 > URL: https://issues.apache.org/jira/browse/HIVE-4240 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Namit Jain >Assignee: Namit Jain > Labels: TODOC11 > Fix For: 0.11.0 > > Attachments: hive.4240.1.patch, hive.4240.2.patch, hive.4240.3.patch, > hive.4240.4.patch, hive.4240.5.patch, hive.4240.5.patch-nohcat > > > Consider the following scenario: > set hive.optimize.bucketmapjoin = true; > set hive.optimize.bucketmapjoin.sortedmerge = true; > set hive.input.format = > org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat; > set hive.enforce.bucketing=true; > set hive.enforce.sorting=true; > set hive.exec.reducers.max = 1; > set hive.merge.mapfiles=false; > set hive.merge.mapredfiles=false; > -- Create two bucketed and sorted tables > CREATE TABLE test_table1 (key INT, value STRING) PARTITIONED BY (ds STRING) > CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS; > CREATE TABLE test_table2 (key INT, value STRING) PARTITIONED BY (ds STRING) > CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS; > FROM src > INSERT OVERWRITE TABLE test_table1 PARTITION (ds = '1') SELECT *; > -- Insert data into the bucketed table by selecting from another bucketed > table > -- This should be a map-only operation > INSERT OVERWRITE TABLE test_table2 PARTITION (ds = '1') > SELECT a.key, a.value FROM test_table1 a WHERE a.ds = '1'; > We should not need a reducer to perform the above operation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11107) Support for Performance regression test suite with TPCDS
[ https://issues.apache.org/jira/browse/HIVE-11107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hari Sankar Sivarama Subramaniyan updated HIVE-11107: - Attachment: HIVE-11107.3.patch > Support for Performance regression test suite with TPCDS > > > Key: HIVE-11107 > URL: https://issues.apache.org/jira/browse/HIVE-11107 > Project: Hive > Issue Type: Task >Reporter: Hari Sankar Sivarama Subramaniyan >Assignee: Hari Sankar Sivarama Subramaniyan > Attachments: HIVE-11107.1.patch, HIVE-11107.2.patch, > HIVE-11107.3.patch > > > Support to add TPCDS queries to the performance regression test suite with > Hive CBO turned on. > This benchmark is intended to make sure that subsequent changes to the > optimizer or any hive code do not yield any unexpected plan changes. i.e. > the intention is to not run the entire TPCDS query set, but just "explain > plan" for the TPCDS queries. > As part of this jira, we will manually verify that expected hive > optimizations kick in for the queries (for given stats/dataset). If there is > a difference in plan within this test suite due to a future commit, it needs > to be analyzed and we need to make sure that it is not a regression. > The test suite can be run in master branch from itests by > {code} > mvn test -Dtest=TestPerfCliDriver -Phadoop-2 > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12491) Column Statistics: 3 attribute join on a 2-source table is off
[ https://issues.apache.org/jira/browse/HIVE-12491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025977#comment-15025977 ] Ashutosh Chauhan commented on HIVE-12491: - I guess what Gopal is pointing out is that the multiple-PK case is missing, which might help this use case (as demonstrated in his WIP patch). The other thing is that we failed to recognize that, out of the 3 columns, two are different UDFs on the same column, so we incorrectly computed the denom for that. Ideally we need to fix both, but doing at least one of these two will help. > Column Statistics: 3 attribute join on a 2-source table is off > -- > > Key: HIVE-12491 > URL: https://issues.apache.org/jira/browse/HIVE-12491 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Prasanth Jayachandran > Attachments: HIVE-12491.WIP.patch > > > The eased out denominator has to detect duplicate row-stats from different > attributes. > {code} > private Long getEasedOutDenominator(List distinctVals) { > // Exponential back-off for NDVs. > // 1) Descending order sort of NDVs > // 2) denominator = NDV1 * (NDV2 ^ (1/2)) * (NDV3 ^ (1/4))) * > Collections.sort(distinctVals, Collections.reverseOrder()); > long denom = distinctVals.get(0); > for (int i = 1; i < distinctVals.size(); i++) { > denom = (long) (denom * Math.pow(distinctVals.get(i), 1.0 / (1 << > i))); > } > return denom; > } > {code} > This gets {{[8007986, 821974390, 821974390]}}, which is actually 3 columns 2 > of which are derived from the same column. 
> {code} > Reduce Output Operator (RS_12) > key expressions: _col0 (type: bigint), year(_col2) (type: int), > month(_col2) (type: int) > sort order: +++ > Map-reduce partition columns: _col0 (type: bigint), year(_col2) > (type: int), month(_col2) (type: int) > value expressions: _col1 (type: bigint) > Join Operator (JOIN_13) > condition map: > Inner Join 0 to 1 > keys: > 0 _col0 (type: bigint), year(_col1) (type: int), month(_col1) > (type: int) > 1 _col0 (type: bigint), year(_col2) (type: int), month(_col2) > (type: int) > outputColumnNames: _col3 > {code} > So the eased out denominator is off by a factor of 30,000 or so, causing OOMs > in map-joins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
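To see the size of the error, the quoted heuristic can be transliterated directly. The method below mirrors the snippet in the report; the comparison against a deduplicated NDV list is our illustration of the duplicate-attribute problem, not code from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class EasedDenominator {
    // Transliteration of the snippet quoted above:
    // denom = NDV1 * NDV2^(1/2) * NDV3^(1/4) * ... over descending NDVs.
    static long getEasedOutDenominator(List<Long> distinctVals) {
        Collections.sort(distinctVals, Collections.reverseOrder());
        long denom = distinctVals.get(0);
        for (int i = 1; i < distinctVals.size(); i++) {
            denom = (long) (denom * Math.pow(distinctVals.get(i), 1.0 / (1 << i)));
        }
        return denom;
    }

    public static void main(String[] args) {
        // NDVs from the report: year(_col2) and month(_col2) both derive from
        // _col2, but the heuristic treats them as three independent attributes.
        long treatedAsThree = getEasedOutDenominator(
                new ArrayList<>(Arrays.asList(8007986L, 821974390L, 821974390L)));
        long deduplicated = getEasedOutDenominator(
                new ArrayList<>(Arrays.asList(8007986L, 821974390L)));
        // The inflated denominator shrinks the estimated join cardinality,
        // which is what drives the map-join OOMs described in the report.
        System.out.println((double) treatedAsThree / deduplicated);
    }
}
```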
[jira] [Commented] (HIVE-12329) Turn on limit pushdown optimization by default
[ https://issues.apache.org/jira/browse/HIVE-12329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025926#comment-15025926 ] Prasanth Jayachandran commented on HIVE-12329: -- cp_sel.q.out - I am guessing order is not guaranteed for limit pushdown, and that's why the change? insert_into3.q.out - Any idea why a new map task is introduced? Other than these, LGTM. +1 > Turn on limit pushdown optimization by default > -- > > Key: HIVE-12329 > URL: https://issues.apache.org/jira/browse/HIVE-12329 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12329.2.patch, HIVE-12329.patch > > > Whenever applicable, this will always help, so this should be on by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12491) Column Statistics: 3 attribute join on a 2-source table is off
[ https://issues.apache.org/jira/browse/HIVE-12491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025956#comment-15025956 ] Pengcheng Xiong commented on HIVE-12491: PK-FK inference in StatsRuleProcFactory is not limited to a single PK and a single FK. It is limited to a single PK only. That is, we allow single PK and multiple FKs. In a single PK and multiple FKs case, we first use PK-FK relationship to estimate the row count, NDV, etc and then join with other FKs without PK-FK inference. Hope it helps. > Column Statistics: 3 attribute join on a 2-source table is off > -- > > Key: HIVE-12491 > URL: https://issues.apache.org/jira/browse/HIVE-12491 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Gopal V >Assignee: Prasanth Jayachandran > Attachments: HIVE-12491.WIP.patch > > > The eased out denominator has to detect duplicate row-stats from different > attributes. > {code} > private Long getEasedOutDenominator(List distinctVals) { > // Exponential back-off for NDVs. > // 1) Descending order sort of NDVs > // 2) denominator = NDV1 * (NDV2 ^ (1/2)) * (NDV3 ^ (1/4))) * > Collections.sort(distinctVals, Collections.reverseOrder()); > long denom = distinctVals.get(0); > for (int i = 1; i < distinctVals.size(); i++) { > denom = (long) (denom * Math.pow(distinctVals.get(i), 1.0 / (1 << > i))); > } > return denom; > } > {code} > This gets {{[8007986, 821974390, 821974390]}}, which is actually 3 columns 2 > of which are derived from the same column. 
> {code} > Reduce Output Operator (RS_12) > key expressions: _col0 (type: bigint), year(_col2) (type: int), > month(_col2) (type: int) > sort order: +++ > Map-reduce partition columns: _col0 (type: bigint), year(_col2) > (type: int), month(_col2) (type: int) > value expressions: _col1 (type: bigint) > Join Operator (JOIN_13) > condition map: > Inner Join 0 to 1 > keys: > 0 _col0 (type: bigint), year(_col1) (type: int), month(_col1) > (type: int) > 1 _col0 (type: bigint), year(_col2) (type: int), month(_col2) > (type: int) > outputColumnNames: _col3 > {code} > So the eased out denominator is off by a factor of 30,000 or so, causing OOMs > in map-joins. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
[ https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026010#comment-15026010 ] Hui Zheng commented on HIVE-11531: -- Thanks [~jcamachorodriguez] I have implemented {code} LIMIT n OFFSET skip {code} > Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise > - > > Key: HIVE-11531 > URL: https://issues.apache.org/jira/browse/HIVE-11531 > Project: Hive > Issue Type: Improvement >Reporter: Sergey Shelukhin >Assignee: Hui Zheng > Attachments: HIVE-11531.02.patch, HIVE-11531.WIP.1.patch, > HIVE-11531.WIP.2.patch, HIVE-11531.patch > > > For any UIs that involve pagination, it is useful to issue queries in the > form SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be > paginated (which can be extremely large by itself). At present, ROW_NUMBER > can be used to achieve this effect, but optimizations for LIMIT such as TopN > in ReduceSink do not apply to ROW_NUMBER. We can add first class support for > "skip" to existing limit, or improve ROW_NUMBER for better performance -- This message was sent by Atlassian JIRA (v6.3.4#6332)
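The {{LIMIT n OFFSET skip}} semantics being implemented are the standard pagination window over an ordered result: drop the first skip rows, keep at most n. A small stdlib sketch (not Hive code) of that arithmetic, which is also why the TopN optimization can apply (each reducer only ever needs to retain skip + n rows):

```java
import java.util.Arrays;
import java.util.List;

public class LimitOffset {
    // Equivalent of "LIMIT n OFFSET skip" over an already-ordered result:
    // skip the first `skip` rows, then keep at most `n` rows.
    static <T> List<T> page(List<T> ordered, int skip, int n) {
        int from = Math.min(skip, ordered.size());
        int to = Math.min(from + n, ordered.size());
        return ordered.subList(from, to);
    }

    public static void main(String[] args) {
        List<Integer> rows = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        // MySQL's "LIMIT 2,3" is the same window as "LIMIT 3 OFFSET 2".
        System.out.println(page(rows, 2, 3)); // prints [3, 4, 5]
    }
}
```

By contrast, a ROW_NUMBER-based rewrite numbers every row before filtering, which is why the report notes that ReduceSink's TopN optimization does not help it.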
[jira] [Commented] (HIVE-12466) SparkCounter not initialized error
[ https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026017#comment-15026017 ] Rui Li commented on HIVE-12466: --- If spark counter is removed, does HoS support other methods to collect stats, like fs-based? > SparkCounter not initialized error > -- > > Key: HIVE-12466 > URL: https://issues.apache.org/jira/browse/HIVE-12466 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-12466.1-spark.patch > > > During a query, lots of the following error found in executor's log: > {noformat} > 03:47:28.759 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] > has not initialized before. > 03:47:28.762 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] > has not initialized before. > 03:47:30.707 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.tmp_tmp] has not initialized before. > 03:47:33.385 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:33.388 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:33.495 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:35.141 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-4240) optimize hive.enforce.bucketing and hive.enforce sorting insert
[ https://issues.apache.org/jira/browse/HIVE-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026021#comment-15026021 ] Lefty Leverenz commented on HIVE-4240: -- Doc note: This added configuration parameter *hive.optimize.bucketingsorting* to HiveConf.java, so it needs to be documented in the wiki. * [Configuration Properties -- Query and DDL Execution | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution] HIVE-12331 changes the description of *hive.optimize.bucketingsorting* in release 2.0.0. > optimize hive.enforce.bucketing and hive.enforce sorting insert > --- > > Key: HIVE-4240 > URL: https://issues.apache.org/jira/browse/HIVE-4240 > Project: Hive > Issue Type: Improvement > Components: Query Processor >Reporter: Namit Jain >Assignee: Namit Jain > Labels: TODOC11 > Fix For: 0.11.0 > > Attachments: hive.4240.1.patch, hive.4240.2.patch, hive.4240.3.patch, > hive.4240.4.patch, hive.4240.5.patch, hive.4240.5.patch-nohcat > > > Consider the following scenario: > set hive.optimize.bucketmapjoin = true; > set hive.optimize.bucketmapjoin.sortedmerge = true; > set hive.input.format = > org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat; > set hive.enforce.bucketing=true; > set hive.enforce.sorting=true; > set hive.exec.reducers.max = 1; > set hive.merge.mapfiles=false; > set hive.merge.mapredfiles=false; > -- Create two bucketed and sorted tables > CREATE TABLE test_table1 (key INT, value STRING) PARTITIONED BY (ds STRING) > CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS; > CREATE TABLE test_table2 (key INT, value STRING) PARTITIONED BY (ds STRING) > CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS; > FROM src > INSERT OVERWRITE TABLE test_table1 PARTITION (ds = '1') SELECT *; > -- Insert data into the bucketed table by selecting from another bucketed > table > -- This should be a map-only operation > INSERT OVERWRITE TABLE test_table2 PARTITION (ds = '1') > SELECT 
a.key, a.value FROM test_table1 a WHERE a.ds = '1'; > We should not need a reducer to perform the above operation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
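The map-only claim above rests on bucket assignment being a pure function of the clustering key. A minimal Python sketch (illustrative only; Hive's real hashing lives elsewhere and `bucket_of` here is a stand-in) of why a copy between two tables with identical CLUSTERED BY / SORTED BY specs and the same bucket count needs no reducer:

```python
# Illustrative sketch (not Hive code): with identical bucket specs, every
# row in source bucket i already belongs in destination bucket i, so each
# mapper can copy one bucket file directly and no shuffle is required.

def bucket_of(key, num_buckets):
    # stand-in for Hive's hash(clustering key) % numBuckets
    return hash(key) % num_buckets

def map_only_copy(source_buckets, num_buckets):
    # each "mapper" re-emits its bucket unchanged
    dest = [list(b) for b in source_buckets]
    # sanity check: rows already sit in the bucket the destination expects
    for i, bucket in enumerate(dest):
        assert all(bucket_of(k, num_buckets) == i for k, _ in bucket)
    return dest

rows = [(k, "v%d" % k) for k in range(8)]
src = [[], []]
for k, v in rows:
    src[bucket_of(k, 2)].append((k, v))
dst = map_only_copy(src, 2)
```

Because source bucket i holds exactly the rows destination bucket i expects, the insert can stay map-only, which is what this optimization exploits.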
[jira] [Updated] (HIVE-12331) Remove hive.enforce.bucketing & hive.enforce.sorting configs
[ https://issues.apache.org/jira/browse/HIVE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-12331: -- Labels: TODOC2.0 (was: ) > Remove hive.enforce.bucketing & hive.enforce.sorting configs > > > Key: HIVE-12331 > URL: https://issues.apache.org/jira/browse/HIVE-12331 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-12331.1.patch, HIVE-12331.patch > > > If table is created as bucketed and/or sorted and this config is set to > false, you will insert data in wrong buckets and/or sort order and then if > you use these tables subsequently in BMJ or SMBJ you will get wrong results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
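The wrong-results failure mode described above can be seen with a toy bucket map join (a hypothetical Python sketch, not Hive code): the join probes only the bucket the key hashes to, so a row written into the wrong bucket with enforcement disabled is silently never matched.

```python
# Sketch of why misplaced data yields wrong join results: a bucket map join
# consults only the build-side bucket that the probe key hashes to.

def bucket_map_join(probe_rows, build_buckets, num_buckets):
    out = []
    for k, v in probe_rows:
        b = hash(k) % num_buckets          # only this bucket is consulted
        for k2, v2 in build_buckets[b]:
            if k2 == k:
                out.append((k, v, v2))
    return out

# key 3 hashes to bucket 1, but was written to bucket 0 (enforcement off),
# so the matching row is never found
bad_buckets = [[(3, "misplaced")], []]
result = bucket_map_join([(3, "probe")], bad_buckets, 2)
```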
[jira] [Commented] (HIVE-12466) SparkCounter not initialized error
[ https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026034#comment-15026034 ] Chengxiang Li commented on HIVE-12466: -- Yes, it does; at least at the time I implemented the counter-based stats collection for Spark, it did not relate to any other part of our work on HoS, so I assume it should work just as well now. > SparkCounter not initialized error > -- > > Key: HIVE-12466 > URL: https://issues.apache.org/jira/browse/HIVE-12466 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-12466.1-spark.patch > > > During a query, lots of the following error found in executor's log: > {noformat} > 03:47:28.759 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] > has not initialized before. > 03:47:28.762 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] > has not initialized before. > 03:47:30.707 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.tmp_tmp] has not initialized before. > 03:47:33.385 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:33.388 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:33.495 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:35.141 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > ... 
> {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
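The log above points at executors incrementing counters that were never created up front. A Python sketch of the assumed semantics (`SparkCountersSketch` is hypothetical, not the org.apache.hive.spark.counter API): a counter must be registered before the job runs, and an increment against an unregistered name only produces the "has not initialized before" error.

```python
# Minimal sketch (assumed semantics, not the actual SparkCounters class):
# incrementing a counter that was never registered logs the error seen in
# the executor log instead of counting.

class SparkCountersSketch:
    def __init__(self):
        self.counters = {}
        self.errors = []

    def create_counter(self, group, name):
        # registration must happen before the job ships to executors
        self.counters[(group, name)] = 0

    def increment(self, group, name, amount):
        if (group, name) not in self.counters:
            # mirrors "counter[HIVE, RECORDS_OUT_0] has not initialized before."
            self.errors.append(
                "counter[%s, %s] has not initialized before." % (group, name))
            return
        self.counters[(group, name)] += amount

c = SparkCountersSketch()
c.increment("HIVE", "RECORDS_OUT_0", 1)   # not registered -> error recorded
c.create_counter("HIVE", "RECORDS_OUT_0")
c.increment("HIVE", "RECORDS_OUT_0", 1)   # now counts
```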
[jira] [Commented] (HIVE-12466) SparkCounter not initialized error
[ https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026042#comment-15026042 ] Xuefu Zhang commented on HIVE-12466: Thanks, guys. Let's get this in and use a separate JIRA to do the cleanup. > SparkCounter not initialized error > -- > > Key: HIVE-12466 > URL: https://issues.apache.org/jira/browse/HIVE-12466 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-12466.1-spark.patch > > > During a query, lots of the following error found in executor's log: > {noformat} > 03:47:28.759 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] > has not initialized before. > 03:47:28.762 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] > has not initialized before. > 03:47:30.707 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.tmp_tmp] has not initialized before. > 03:47:33.385 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:33.388 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:33.495 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:35.141 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12331) Remove hive.enforce.bucketing & hive.enforce.sorting configs
[ https://issues.apache.org/jira/browse/HIVE-12331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026048#comment-15026048 ] Lefty Leverenz commented on HIVE-12331: --- Doc note: This removes *hive.enforce.bucketing* & *hive.enforce.sorting* from HiveConf.java, and changes the description of *hive.optimize.bucketingsorting* (created by HIVE-4240 in release 0.11.0 and not documented yet in the wiki) so Configuration Properties needs to be updated. * [Configuration Properties -- Query and DDL Execution | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-QueryandDDLExecution] ** [Configuration Properties -- hive.enforce.bucketing | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.enforce.bucketing] ** [Configuration Properties -- hive.enforce.sorting | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.enforce.sorting] ** [Configuration Properties -- hive.optimize.bucketingsorting | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.optimize.bucketingsorting] (this link will work after the parameter has been documented) Other wikidocs that need updates because they mention the removed *hive.enforce.bucketing* parameter: * [Hive Transactions -- Configuration (annotate *hive.enforce.bucketing* with version information) | https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-Configuration] * [Hive Transactions -- Configuration Values to Set for INSERT, UPDATE, DELETE (annotate *hive.enforce.bucketing* with version information) | https://cwiki.apache.org/confluence/display/Hive/Hive+Transactions#HiveTransactions-ConfigurationValuestoSetforINSERT,UPDATE,DELETE] * [Bucketed Tables (3 instances of *hive.enforce.bucketing*) | https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL+BucketedTables] * 
[AdminManual Configuration -- Hive Configuration Variables (1 instance of *hive.enforce.bucketing*) | https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-HiveConfigurationVariables] * [Configuration Properties -- Transactions and Compactor (*hive.enforce.bucketing* in list of parameters that need non-default values) | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-TransactionsandCompactor] > Remove hive.enforce.bucketing & hive.enforce.sorting configs > > > Key: HIVE-12331 > URL: https://issues.apache.org/jira/browse/HIVE-12331 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-12331.1.patch, HIVE-12331.patch > > > If table is created as bucketed and/or sorted and this config is set to > false, you will insert data in wrong buckets and/or sort order and then if > you use these tables subsequently in BMJ or SMBJ you will get wrong results. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12329) Turn on limit pushdown optimization by default
[ https://issues.apache.org/jira/browse/HIVE-12329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-12329: -- Labels: TODOC2.0 (was: ) > Turn on limit pushdown optimization by default > -- > > Key: HIVE-12329 > URL: https://issues.apache.org/jira/browse/HIVE-12329 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-12329.2.patch, HIVE-12329.patch > > > Whenever applicable, this will always help, so this should be on by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12329) Turn on limit pushdown optimization by default
[ https://issues.apache.org/jira/browse/HIVE-12329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026080#comment-15026080 ] Lefty Leverenz commented on HIVE-12329: --- Doc note: This changes the default value and description of *hive.limit.pushdown.memory.usage* which was introduced by HIVE-3562 in release 0.12.0. It needs to be updated in the wiki: * [Configuration Properties -- hive.limit.pushdown.memory.usage | https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.limit.pushdown.memory.usage] *hive.limit.pushdown.memory.usage* is also mentioned in Hive on Spark: Getting Started but doesn't seem to need revision there -- it just shows a recommended value of 0.4: * [Hive on Spark: Getting Started -- Recommended Configuration | https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark:+Getting+Started#HiveonSpark:GettingStarted-RecommendedConfiguration] > Turn on limit pushdown optimization by default > -- > > Key: HIVE-12329 > URL: https://issues.apache.org/jira/browse/HIVE-12329 > Project: Hive > Issue Type: Improvement > Components: Configuration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-12329.2.patch, HIVE-12329.patch > > > Whenever applicable, this will always help, so this should be on by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
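For context on why this is safe to turn on by default: limit pushdown lets the operator feeding the shuffle keep only the current best n rows when the query ends in ORDER BY ... LIMIT n, with *hive.limit.pushdown.memory.usage* capping the fraction of task memory that buffer may use. A rough Python sketch of the idea (not Hive's implementation):

```python
# Top-N pushdown in miniature: instead of emitting every row into the
# shuffle, keep a bounded structure holding only the n best rows seen so
# far. heapq.nsmallest keeps at most n items in memory at a time.
import heapq

def top_n(rows, key, n):
    return heapq.nsmallest(n, rows, key=key)

rows = [(i % 7, i) for i in range(100)]
result = top_n(rows, key=lambda r: r[0], n=3)
```

Each task ships at most n rows downstream regardless of input size, which is why the optimization helps whenever it applies.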
[jira] [Commented] (HIVE-12466) SparkCounter not initialized error
[ https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026081#comment-15026081 ] Chengxiang Li commented on HIVE-12466: -- Committed to spark branch, thanks Rui for this contribution. > SparkCounter not initialized error > -- > > Key: HIVE-12466 > URL: https://issues.apache.org/jira/browse/HIVE-12466 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-12466.1-spark.patch > > > During a query, lots of the following error found in executor's log: > {noformat} > 03:47:28.759 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] > has not initialized before. > 03:47:28.762 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] > has not initialized before. > 03:47:30.707 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.tmp_tmp] has not initialized before. > 03:47:33.385 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:33.388 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:33.495 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:35.141 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12466) SparkCounter not initialized error
[ https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026089#comment-15026089 ] Chengxiang Li commented on HIVE-12466: -- HIVE-12515 is created for the following cleanup work. > SparkCounter not initialized error > -- > > Key: HIVE-12466 > URL: https://issues.apache.org/jira/browse/HIVE-12466 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-12466.1-spark.patch > > > During a query, lots of the following error found in executor's log: > {noformat} > 03:47:28.759 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] > has not initialized before. > 03:47:28.762 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] > has not initialized before. > 03:47:30.707 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.tmp_tmp] has not initialized before. > 03:47:33.385 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:33.388 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:33.495 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:35.141 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12469) Bump Commons-Collections dependency from 3.2.1 to 3.2.2. to address vulnerability
[ https://issues.apache.org/jira/browse/HIVE-12469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026121#comment-15026121 ] Hive QA commented on HIVE-12469: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12773866/HIVE-12469.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 9827 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniLlapCliDriver - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6119/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6119/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6119/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12773866 - PreCommit-HIVE-TRUNK-Build > Bump Commons-Collections dependency from 3.2.1 to 3.2.2. 
to address > vulnerability > - > > Key: HIVE-12469 > URL: https://issues.apache.org/jira/browse/HIVE-12469 > Project: Hive > Issue Type: Bug > Components: Build Infrastructure >Reporter: Reuben Kuhnert >Assignee: Reuben Kuhnert >Priority: Blocker > Attachments: HIVE-12469.2.patch, HIVE-12469.patch > > > Currently the commons-collections (3.2.1) library allows for invocation of > arbitrary code through {{InvokerTransformer}}, need to bump the version of > commons-collections from 3.2.1 to 3.2.2 to resolve this issue. > Results of {{mvn dependency:tree}}: > {code} > [INFO] > > [INFO] Building Hive HPL/SQL 2.0.0-SNAPSHOT > [INFO] > > [INFO] > [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ hive-hplsql --- > [INFO] org.apache.hive:hive-hplsql:jar:2.0.0-SNAPSHOT > [INFO] +- com.google.guava:guava:jar:14.0.1:compile > [INFO] +- commons-collections:commons-collections:jar:3.2.1:compile > {code} > {code} > [INFO] > > [INFO] Building Hive Packaging 2.0.0-SNAPSHOT > [INFO] > > [INFO] +- org.apache.hive:hive-hbase-handler:jar:2.0.0-SNAPSHOT:compile > [INFO] | +- org.apache.hbase:hbase-server:jar:1.1.1:compile > [INFO] | | +- commons-collections:commons-collections:jar:3.2.1:compile > {code} > {code} > [INFO] > > [INFO] Building Hive Common 2.0.0-SNAPSHOT > [INFO] > > [INFO] > [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ hive-common --- > [INFO] +- org.apache.hadoop:hadoop-common:jar:2.6.0:compile > [INFO] | +- commons-collections:commons-collections:jar:3.2.1:compile > {code} > {{Hadoop-Common}} dependency also found in: LLAP, Serde, Storage, Shims, > Shims Common, Shims Scheduler) > {code} > [INFO] > > [INFO] Building Hive Ant Utilities 2.0.0-SNAPSHOT > [INFO] > > [INFO] > [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ hive-ant --- > [INFO] | +- commons-collections:commons-collections:jar:3.1:compile > {code} > {code} > [INFO] > > [INFO] > > [INFO] Building Hive Accumulo Handler 2.0.0-SNAPSHOT > [INFO] > > [INFO] +- 
org.apache.accumulo:accumulo-core:jar:1.6.0:compile > [INFO] | +- commons-collections:commons-collections:jar:3.2.1:compile > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
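Since several of the 3.2.1 copies listed above arrive transitively (hbase-server, hadoop-common, accumulo-core), bumping a direct dependency alone may not be enough; one way to force 3.2.2 everywhere is a dependencyManagement pin in the root POM. A sketch only — the actual patch may do this differently, for example via a version property:

```xml
<!-- Hypothetical root-POM fragment: dependencyManagement also overrides
     the version Maven picks for transitive commons-collections copies. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>commons-collections</groupId>
      <artifactId>commons-collections</artifactId>
      <version>3.2.2</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

After the change, rerunning {{mvn dependency:tree}} should show 3.2.2 in each affected module.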
[jira] [Commented] (HIVE-12487) Fix broken MiniLlap tests
[ https://issues.apache.org/jira/browse/HIVE-12487?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026128#comment-15026128 ] Aleksei Statkevich commented on HIVE-12487: --- TestMiniLlapCliDriver tests pass fine now. Spark tests pass fine for me locally. Errors during the test run seem to be unrelated: {code} Unexpected exception java.lang.IllegalStateException: Error trying to obtain executor info: java.lang.IllegalStateException: RPC channel is closed. at org.apache.hadoop.hive.ql.QTestUtil$1.setSparkSession(QTestUtil.java:1022)} {code} > Fix broken MiniLlap tests > - > > Key: HIVE-12487 > URL: https://issues.apache.org/jira/browse/HIVE-12487 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Aleksei Statkevich >Assignee: Aleksei Statkevich >Priority: Critical > Attachments: HIVE-12487.1.patch, HIVE-12487.2.patch, HIVE-12487.patch > > > Currently MiniLlap tests fail with the following error: > {code} > TestMiniLlapCliDriver - did not produce a TEST-*.xml file > {code} > Supposedly, it started happening after HIVE-12319. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-11927) Implement/Enable constant related optimization rules in Calcite: enable HiveReduceExpressionsRule to fold constants
[ https://issues.apache.org/jira/browse/HIVE-11927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026134#comment-15026134 ] Laljo John Pullokkaran commented on HIVE-11927: --- [~pxiong] Now that Calcite 1.5 is released, let's update the patch. We should just provide the executor and reuse Calcite reduction rules. > Implement/Enable constant related optimization rules in Calcite: enable > HiveReduceExpressionsRule to fold constants > --- > > Key: HIVE-11927 > URL: https://issues.apache.org/jira/browse/HIVE-11927 > Project: Hive > Issue Type: Sub-task >Reporter: Pengcheng Xiong >Assignee: Pengcheng Xiong > Attachments: HIVE-11927.01.patch, HIVE-11927.02.patch, > HIVE-11927.03.patch, HIVE-11927.04.patch, HIVE-11927.05.patch, > HIVE-11927.06.patch, HIVE-11927.07.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
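What the reduction rules do, in miniature: replace any subtree whose operands are all literals with its evaluated value, so the planner sees a constant instead of an expression. A hypothetical Python sketch — not Calcite's RexNode API, where a supplied executor performs the actual evaluation:

```python
# Constant folding over a toy expression tree: (x + (2 * 3)) -> (x + 6).
# An expression is a literal (int), a column name (str), or (op, left, right).
import operator

OPS = {"+": operator.add, "*": operator.mul}

def fold(expr):
    if not isinstance(expr, tuple):
        return expr
    op, l, r = expr
    l, r = fold(l), fold(r)
    if isinstance(l, int) and isinstance(r, int):
        return OPS[op](l, r)       # evaluate the all-constant subtree
    return (op, l, r)

folded = fold(("+", "x", ("*", 2, 3)))
```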
[jira] [Commented] (HIVE-6113) Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
[ https://issues.apache.org/jira/browse/HIVE-6113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026135#comment-15026135 ] Hive QA commented on HIVE-6113: --- Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12773890/HIVE-6113.4.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6120/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6120/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6120/ Messages: {noformat} This message was trimmed, see log for full details main: [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hive-it-util --- [INFO] Compiling 51 source files to /data/hive-ptest/working/apache-github-source-source/itests/util/target/classes [WARNING] /data/hive-ptest/working/apache-github-source-source/itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseQTestUtil.java: Some input files use or override a deprecated API. [WARNING] /data/hive-ptest/working/apache-github-source-source/itests/util/src/main/java/org/apache/hadoop/hive/hbase/HBaseQTestUtil.java: Recompile with -Xlint:deprecation for details. [INFO] [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hive-it-util --- [INFO] Using 'UTF-8' encoding to copy filtered resources. 
[INFO] skip non existing resourceDirectory /data/hive-ptest/working/apache-github-source-source/itests/util/src/test/resources [INFO] Copying 3 resources [INFO] [INFO] --- maven-antrun-plugin:1.7:run (setup-test-dirs) @ hive-it-util --- [INFO] Executing tasks main: [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/itests/util/target/tmp [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/itests/util/target/warehouse [mkdir] Created dir: /data/hive-ptest/working/apache-github-source-source/itests/util/target/tmp/conf [copy] Copying 14 files to /data/hive-ptest/working/apache-github-source-source/itests/util/target/tmp/conf [INFO] Executed tasks [INFO] [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hive-it-util --- [INFO] No sources to compile [INFO] [INFO] --- maven-surefire-plugin:2.16:test (default-test) @ hive-it-util --- [INFO] Tests are skipped. [INFO] [INFO] --- maven-jar-plugin:2.2:jar (default-jar) @ hive-it-util --- [INFO] Building jar: /data/hive-ptest/working/apache-github-source-source/itests/util/target/hive-it-util-2.0.0-SNAPSHOT.jar [INFO] [INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ hive-it-util --- [INFO] [INFO] --- maven-install-plugin:2.4:install (default-install) @ hive-it-util --- [INFO] Installing /data/hive-ptest/working/apache-github-source-source/itests/util/target/hive-it-util-2.0.0-SNAPSHOT.jar to /data/hive-ptest/working/maven/org/apache/hive/hive-it-util/2.0.0-SNAPSHOT/hive-it-util-2.0.0-SNAPSHOT.jar [INFO] Installing /data/hive-ptest/working/apache-github-source-source/itests/util/pom.xml to /data/hive-ptest/working/maven/org/apache/hive/hive-it-util/2.0.0-SNAPSHOT/hive-it-util-2.0.0-SNAPSHOT.pom [INFO] [INFO] [INFO] Building Hive Integration - Unit Tests 2.0.0-SNAPSHOT [INFO] [INFO] [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hive-it-unit --- [INFO] Deleting 
/data/hive-ptest/working/apache-github-source-source/itests/hive-unit/target [INFO] Deleting /data/hive-ptest/working/apache-github-source-source/itests/hive-unit (includes = [datanucleus.log, derby.log], excludes = []) [INFO] [INFO] --- maven-enforcer-plugin:1.3.1:enforce (enforce-no-snapshots) @ hive-it-unit --- [INFO] [INFO] --- maven-antrun-plugin:1.7:run (download-spark) @ hive-it-unit --- [INFO] Executing tasks main: [exec] + /bin/pwd [exec] /data/hive-ptest/working/apache-github-source-source/itests/hive-unit [exec] + BASE_DIR=./target [exec] + HIVE_ROOT=./target/../../../ [exec] + DOWNLOAD_DIR=./../thirdparty [exec] + mkdir -p ./../thirdparty [exec] + download http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.5.0-bin-hadoop2-without-hive.tgz spark [exec] + url=http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.5.0-bin-hadoop2-without-hive.tgz [exec] + finalName=spark [exec] ++ basename http://d3jw87u4immizc.cloudfront.net/spark-tarball/spark-1.5.0-bin-hadoop2-without-hive.tgz [exec] + tarName=spark-1.5.0-bin-hadoop2-without-hive.tgz [exec] + rm -rf ./target/spark [exec]
[jira] [Updated] (HIVE-12184) DESCRIBE of fully qualified table fails when db and table name match and non-default database is in use
[ https://issues.apache.org/jira/browse/HIVE-12184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-12184: - Attachment: HIVE-12184.9.patch > DESCRIBE of fully qualified table fails when db and table name match and > non-default database is in use > --- > > Key: HIVE-12184 > URL: https://issues.apache.org/jira/browse/HIVE-12184 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 1.2.1 >Reporter: Lenni Kuff >Assignee: Naveen Gangam > Attachments: HIVE-12184.2.patch, HIVE-12184.3.patch, > HIVE-12184.4.patch, HIVE-12184.5.patch, HIVE-12184.6.patch, > HIVE-12184.7.patch, HIVE-12184.8.patch, HIVE-12184.9.patch, HIVE-12184.patch > > > DESCRIBE of fully qualified table fails when db and table name match and > non-default database is in use. > Repro: > {code} > : jdbc:hive2://localhost:1/default> create database foo; > No rows affected (0.116 seconds) > 0: jdbc:hive2://localhost:1/default> create table foo.foo(i int); > 0: jdbc:hive2://localhost:1/default> describe foo.foo; > +---++--+--+ > | col_name | data_type | comment | > +---++--+--+ > | i | int| | > +---++--+--+ > 1 row selected (0.049 seconds) > 0: jdbc:hive2://localhost:1/default> use foo; > 0: jdbc:hive2://localhost:1/default> describe foo.foo; > Error: Error while processing statement: FAILED: Execution Error, return code > 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from > serde.Invalid Field foo (state=08S01,code=1) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12184) DESCRIBE of fully qualified table fails when db and table name match and non-default database is in use
[ https://issues.apache.org/jira/browse/HIVE-12184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026162#comment-15026162 ] Xuefu Zhang commented on HIVE-12184: +1 pending on test. > DESCRIBE of fully qualified table fails when db and table name match and > non-default database is in use > --- > > Key: HIVE-12184 > URL: https://issues.apache.org/jira/browse/HIVE-12184 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 1.2.1 >Reporter: Lenni Kuff >Assignee: Naveen Gangam > Attachments: HIVE-12184.2.patch, HIVE-12184.3.patch, > HIVE-12184.4.patch, HIVE-12184.5.patch, HIVE-12184.6.patch, > HIVE-12184.7.patch, HIVE-12184.8.patch, HIVE-12184.9.patch, HIVE-12184.patch > > > DESCRIBE of fully qualified table fails when db and table name match and > non-default database is in use. > Repro: > {code} > : jdbc:hive2://localhost:1/default> create database foo; > No rows affected (0.116 seconds) > 0: jdbc:hive2://localhost:1/default> create table foo.foo(i int); > 0: jdbc:hive2://localhost:1/default> describe foo.foo; > +---++--+--+ > | col_name | data_type | comment | > +---++--+--+ > | i | int| | > +---++--+--+ > 1 row selected (0.049 seconds) > 0: jdbc:hive2://localhost:1/default> use foo; > 0: jdbc:hive2://localhost:1/default> describe foo.foo; > Error: Error while processing statement: FAILED: Execution Error, return code > 1 from org.apache.hadoop.hive.ql.exec.DDLTask. Error in getting fields from > serde.Invalid Field foo (state=08S01,code=1) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
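The bug comes from an inherent ambiguity: once {{use foo}} makes the database and table names collide, {{describe foo.foo}} can parse either as table {{foo}} in database {{foo}} or as column {{foo}} of table {{foo}} in the current database, and the failing path chose the latter. A hypothetical Python sketch of a resolution order that prefers db.table when the database exists (names and structure are illustrative, not Hive's analyzer):

```python
# Resolve "a.b" against a catalog shaped as {db: {table: [columns]}}.
# Trying db.table first keeps DESCRIBE foo.foo stable regardless of the
# current database.

def resolve(name, current_db, databases):
    a, b = name.split(".")
    if a in databases and b in databases[a]:
        return ("table", a, b)                 # db.table wins when it exists
    if a in databases[current_db]:
        return ("column", current_db, a, b)    # else table.column in current db
    raise LookupError(name)

dbs = {"default": {}, "foo": {"foo": ["i"]}}
assert resolve("foo.foo", "foo", dbs) == ("table", "foo", "foo")
```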
[jira] [Updated] (HIVE-12055) Create row-by-row shims for the write path
[ https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-12055: - Target Version/s: 2.0.0 > Create row-by-row shims for the write path > --- > > Key: HIVE-12055 > URL: https://issues.apache.org/jira/browse/HIVE-12055 > Project: Hive > Issue Type: Sub-task >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-12055.patch, HIVE-12055.patch > > > As part of removing the row-by-row writer, we'll need to shim out the higher > level API (OrcSerde and OrcOutputFormat) so that we maintain backwards > compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12055) Create row-by-row shims for the write path
[ https://issues.apache.org/jira/browse/HIVE-12055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-12055: - Attachment: HIVE-12055.patch Updated to the current HIVE-11890 patch. Passes all of the ORC unit tests and qfiles. > Create row-by-row shims for the write path > --- > > Key: HIVE-12055 > URL: https://issues.apache.org/jira/browse/HIVE-12055 > Project: Hive > Issue Type: Sub-task >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-12055.patch, HIVE-12055.patch, HIVE-12055.patch > > > As part of removing the row-by-row writer, we'll need to shim out the higher > level API (OrcSerde and OrcOutputFormat) so that we maintain backwards > compatibility. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-11890) Create ORC module
[ https://issues.apache.org/jira/browse/HIVE-11890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Owen O'Malley updated HIVE-11890: - Target Version/s: 2.0.0 > Create ORC module > - > > Key: HIVE-11890 > URL: https://issues.apache.org/jira/browse/HIVE-11890 > Project: Hive > Issue Type: Sub-task >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Attachments: HIVE-11890.patch, HIVE-11890.patch, HIVE-11890.patch, > HIVE-11890.patch > > > Start moving classes over to the ORC module. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12483) Fix precommit Spark test branch
[ https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-12483: --- Attachment: (was: HIVE-12483.1-spark.patch) > Fix precommit Spark test branch > --- > > Key: HIVE-12483 > URL: https://issues.apache.org/jira/browse/HIVE-12483 > Project: Hive > Issue Type: Task >Reporter: Sergio Peña >Assignee: Sergio Peña > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12517) Beeline's use of failed connection(s) causes failures and leaks.
[ https://issues.apache.org/jira/browse/HIVE-12517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-12517: - Attachment: HIVE-12517.patch Attaching a patch with a proposed fix. Below are results from a test. {code} beeline> !connect jdbc:hive2://localhost:1 hive1 hive1 scan complete in 9ms Connecting to jdbc:hive2://localhost:1 Connected to: Apache Hive (version 2.0.0-SNAPSHOT) Driver: Hive JDBC (version 1.1.0-cdh5.7.0-SNAPSHOT) Transaction isolation: TRANSACTION_REPEATABLE_READ 0: jdbc:hive2://localhost:1> !connect jdbc:hive2://localhost:1 hive1 hive1 Connecting to jdbc:hive2://localhost:1 Connected to: Apache Hive (version 2.0.0-SNAPSHOT) Driver: Hive JDBC (version 1.1.0-cdh5.7.0-SNAPSHOT) Transaction isolation: TRANSACTION_REPEATABLE_READ 1: jdbc:hive2://localhost:1> !connect jdbc:hive2://localhost:1 hive1 hive1 Connecting to jdbc:hive2://localhost:1 Connected to: Apache Hive (version 2.0.0-SNAPSHOT) Driver: Hive JDBC (version 1.1.0-cdh5.7.0-SNAPSHOT) Transaction isolation: TRANSACTION_REPEATABLE_READ 2: jdbc:hive2://localhost:1> !connect jdbc:hive2://localhost:1 hive1 hive1 Connecting to jdbc:hive2://localhost:1 Connected to: Apache Hive (version 2.0.0-SNAPSHOT) Driver: Hive JDBC (version 1.1.0-cdh5.7.0-SNAPSHOT) Transaction isolation: TRANSACTION_REPEATABLE_READ 3: jdbc:hive2://localhost:1> !connect jdbc:hive2://localhost:11000 hive1 hive1 Connecting to jdbc:hive2://localhost:11000 Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:11000: java.net.ConnectException: Connection refused (state=08S01,code=0) 3: jdbc:hive2://localhost:1> !tables ++--+-+-+--+--+ | TABLE_CAT | TABLE_SCHEM | TABLE_NAME | TABLE_TYPE | REMARKS | ++--+-+-+--+--+ || default | char_nested_1 | TABLE | NULL | || default | src | TABLE | NULL | || default | char_nested_struct | TABLE | NULL | || default | src_thrift | TABLE | NULL | || default | x | TABLE | NULL | ++--+-+-+--+--+ 3: jdbc:hive2://localhost:1> !list 4 
active connections:
#0 open jdbc:hive2://localhost:1
#1 open jdbc:hive2://localhost:1
#2 open jdbc:hive2://localhost:1
#3 open jdbc:hive2://localhost:1
3: jdbc:hive2://localhost:1> !closeall
Closing: 3: jdbc:hive2://localhost:1
Closing: 2: jdbc:hive2://localhost:1
Closing: 1: jdbc:hive2://localhost:1
Closing: 0: jdbc:hive2://localhost:1
beeline>
{code}
Also, when the first connection attempt is unsuccessful, the beeline prompt is currently set to
{code}
beeline> !connect jdbc:hive2://localhost:11000 hive1 hive1
Connecting to jdbc:hive2://localhost:11000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:11000: java.net.ConnectException: Connection refused (state=08S01,code=0)
0: jdbc:hive2://localhost:11000 (closed)>
{code}
With the patch, the prompt is still "beeline>" as below:
{code}
beeline> !connect jdbc:hive2://localhost:11000 hive1 hive1
Connecting to jdbc:hive2://localhost:11000
Error: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:11000: java.net.ConnectException: Connection refused (state=08S01,code=0)
beeline>
{code}
> Beeline's use of failed connection(s) causes failures and leaks. > > > Key: HIVE-12517 > URL: https://issues.apache.org/jira/browse/HIVE-12517 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Naveen Gangam >Assignee: Naveen Gangam >Priority: Minor > Fix For: 2.0.0 > > Attachments: HIVE-12517.patch > > > Beeline adds a bad connection(s) to the connection list and makes it the > current connection, so any subsequent queries will attempt to use this bad > connection and will fail. Even a "!close" would not work. > 1) all queries fail unless !go is used. > 2) !closeall cannot close the active connections either. > 3) !exit will exit while attempting to establish these inactive connections > without closing the active connections. So this could hold up server side > resources.
> {code} > beeline> !connect jdbc:hive2://localhost:1 hive1 hive1 > scan complete in 8ms > Connecting to jdbc:hive2://localhost:1 > Connected to: Apache Hive (version 2.0.0-SNAPSHOT) > Driver: Hive JDBC (version 1.1.0-cdh5.7.0-SNAPSHOT) > Transaction isolation: TRANSACTION_REPEATABLE_READ > 0: jdbc:hive2://localhost:1> !connect
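The behavior reported above comes down to when a connection gets added to Beeline's connection list. Below is a minimal sketch of the guard the test output implies; the class and method names are illustrative, not Beeline's actual internals: a connection is registered, and allowed to drive the prompt, only after the JDBC connect succeeds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: only successfully opened connections are tracked,
// so a failed !connect neither leaks an entry nor changes the prompt.
public class ConnectionList {
    private final List<String> connections = new ArrayList<>();
    private int currentIndex = -1;

    // Returns true only when the connection was opened and registered.
    public boolean connect(String url, boolean openSucceeded) {
        if (!openSucceeded) {
            return false;                     // failed connection is never added
        }
        connections.add(url);
        currentIndex = connections.size() - 1;
        return true;
    }

    // Prompt falls back to "beeline>" when there is no live connection.
    public String prompt() {
        return currentIndex < 0
                ? "beeline>"
                : currentIndex + ": " + connections.get(currentIndex) + ">";
    }

    public int size() {
        return connections.size();
    }
}
```

With this shape, !closeall only ever iterates over connections that actually opened, which is the leak the report describes.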
[jira] [Commented] (HIVE-12483) Fix precommit Spark test branch
[ https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026245#comment-15026245 ] Sergio Peña commented on HIVE-12483: [~xuefuz] schemeAuthority and schemeAuthority2 are passing now. I had to update the ptest server running in the spark master instance to make it work. There was a race condition causing the errors, but it was solved after the update. Are the other failing tests passing in your local branch as well? > Fix precommit Spark test branch > --- > > Key: HIVE-12483 > URL: https://issues.apache.org/jira/browse/HIVE-12483 > Project: Hive > Issue Type: Task >Reporter: Sergio Peña >Assignee: Sergio Peña > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12498) ACID: Setting OrcRecordUpdater.OrcOptions.tableProperties() has no effect
[ https://issues.apache.org/jira/browse/HIVE-12498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-12498: - Attachment: HIVE-12498.2.patch Fixed test case to close file and use different file name. > ACID: Setting OrcRecordUpdater.OrcOptions.tableProperties() has no effect > - > > Key: HIVE-12498 > URL: https://issues.apache.org/jira/browse/HIVE-12498 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Labels: ACID, ORC > Attachments: HIVE-12498.1.patch, HIVE-12498.2.patch > > > OrcRecordUpdater does not honor the > OrcRecordUpdater.OrcOptions.tableProperties() setting. > It would need to translate the specified tableProperties (as listed in > OrcTableProperties enum) to the properties that OrcWriter internally > understands (listed in HiveConf.ConfVars). > This is needed for multiple clients.. like Streaming API and Compactor. > {code:java} > Properties orcTblProps = .. // get Orc Table Properties from MetaStore; > AcidOutputFormat.Options updaterOptions = new > OrcRecordUpdater.OrcOptions(conf) > .inspector(..) > .bucket(..) > .minimumTransactionId(..) > .maximumTransactionId(..) > > .tableProperties(orcTblProps); // <<== > OrcOutputFormat orcOutput = new ... > orcOutput.getRecordUpdater(partitionPath, updaterOptions ); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
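The fix described above needs a translation step: properties supplied via tableProperties() must be copied into the configuration keys the ORC writer actually reads. A minimal sketch of that idea follows; the property-to-conf-key mapping shown is a small illustrative sample, not the complete mapping the patch implements.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Illustrative sketch: translate ORC table properties into the internal
// configuration keys an ORC writer understands. Unknown keys are ignored.
public class OrcPropsTranslator {
    // Sample mapping only; the real mapping covers the OrcTableProperties enum.
    private static final Map<String, String> MAPPING = new HashMap<>();
    static {
        MAPPING.put("orc.compress", "hive.exec.orc.default.compress");
        MAPPING.put("orc.stripe.size", "hive.exec.orc.default.stripe.size");
    }

    public static Map<String, String> translate(Properties tblProps) {
        Map<String, String> conf = new HashMap<>();
        for (String name : tblProps.stringPropertyNames()) {
            String confKey = MAPPING.get(name);
            if (confKey != null) {
                conf.put(confKey, tblProps.getProperty(name));
            }
        }
        return conf;
    }
}
```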
[jira] [Updated] (HIVE-12413) Default mode for hive.mapred.mode should be strict
[ https://issues.apache.org/jira/browse/HIVE-12413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-12413: Attachment: HIVE-12413.3.patch > Default mode for hive.mapred.mode should be strict > -- > > Key: HIVE-12413 > URL: https://issues.apache.org/jira/browse/HIVE-12413 > Project: Hive > Issue Type: Task > Components: Configuration >Reporter: Ashutosh Chauhan >Assignee: Ashutosh Chauhan > Attachments: HIVE-12413.1.patch, HIVE-12413.2.patch, > HIVE-12413.3.patch, HIVE-12413.patch > > > Non-strict mode allows some questionable semantics and questionable > operations. Its better that user makes a conscious choice to enable such a > behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12508) HiveAggregateJoinTransposeRule places a heavy load on the metadata system
[ https://issues.apache.org/jira/browse/HIVE-12508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025601#comment-15025601 ] Jesus Camacho Rodriguez commented on HIVE-12508: The issue was (if I recall correctly) that you end up with cycles in the planning graph (between equivalent sets of expressions) and then a metadata provider can fire up indefinitely. But I guess that, since we currently execute this rule in isolation in Hive and we know it will not produce cycles, we could close this issue, as the metadata provider will never fire up indefinitely. > HiveAggregateJoinTransposeRule places a heavy load on the metadata system > - > > Key: HIVE-12508 > URL: https://issues.apache.org/jira/browse/HIVE-12508 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12508.patch > > > Finding out whether the input is already unique requires a call to > areColumnsUnique that currently (until CALCITE-794 is fixed) places a heavy > load on the metadata system. This can lead to long CBO planning. > This is a temporary fix that avoids the call to the method till then. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12511) IN clause performs differently than = clause
[ https://issues.apache.org/jira/browse/HIVE-12511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025579#comment-15025579 ] Jimmy Xiang commented on HIVE-12511: I think we should fix GenericUDFIn to use the common type for comparison instead of the generic common type. In this case, for the common type of int and string, we should use int instead of string. > IN clause performs differently than = clause > > > Key: HIVE-12511 > URL: https://issues.apache.org/jira/browse/HIVE-12511 > Project: Hive > Issue Type: Bug >Reporter: Jimmy Xiang >Assignee: Jimmy Xiang > > Similar to HIVE-11973, IN clause performs differently than = clause for "int" > type with string values. > For example, > {noformat} > SELECT * FROM inttest WHERE iValue IN ('01'); > {noformat} > will not return any rows with int iValue = 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
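The type mismatch described above can be shown in a few lines. If string is chosen as the common type, the int column value 1 renders as "1", which never equals the literal "01"; coercing the literal to int (as the = path does) makes them equal. This is a minimal illustration of the comparison semantics, not Hive's GenericUDFIn code.

```java
// Illustrates why IN ('01') misses int 1 under string comparison
// but matches under int comparison.
public class InClauseCoercion {
    // Common type string: compare the column's string form to the literal.
    static boolean compareAsString(int columnValue, String literal) {
        return String.valueOf(columnValue).equals(literal);
    }

    // Common type int: coerce the literal, then compare numerically.
    static boolean compareAsInt(int columnValue, String literal) {
        return columnValue == Integer.parseInt(literal);
    }
}
```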
[jira] [Commented] (HIVE-11878) ClassNotFoundException can possibly occur if multiple jars are registered one at a time in Hive
[ https://issues.apache.org/jira/browse/HIVE-11878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025688#comment-15025688 ] Jason Dere commented on HIVE-11878: --- So removing JARs from the session will still require closing the existing classloader and creating a new one (with the specified JARs omitted from the list of URIs), correct? > ClassNotFoundException can possibly occur if multiple jars are registered > one at a time in Hive > > > Key: HIVE-11878 > URL: https://issues.apache.org/jira/browse/HIVE-11878 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.2.1 >Reporter: Ratandeep Ratti >Assignee: Ratandeep Ratti > Labels: URLClassLoader > Attachments: HIVE-11878 ClassLoader Issues when Registering > Jars.pptx, HIVE-11878.patch, HIVE-11878_approach3.patch, > HIVE-11878_approach3_per_session_clasloader.patch, > HIVE-11878_approach3_with_review_comments.patch, > HIVE-11878_approach3_with_review_comments1.patch, HIVE-11878_qtest.patch > > > When we register a jar on the Hive console, Hive creates a fresh URL > classloader which includes the path of the current jar to be registered and > all the jar paths of the parent classloader. The parent classloader is the > current ThreadContextClassLoader. Once the URLClassloader is created Hive > sets that as the current ThreadContextClassloader. > So if we register multiple jars in Hive, there will be multiple > URLClassLoaders created, each classloader including the jars from its parent > and the one extra jar to be registered. The last URLClassLoader created will > end up as the current ThreadContextClassLoader. (See details: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath) > Now here's an example in which the above strategy can lead to a CNF exception. > We register 2 jars *j1* and *j2* in Hive console. *j1* contains the UDF class > *c1* and internally relies on class *c2* in jar *j2*.
We register *j1* first, > the URLClassLoader *u1* is created and also set as the > ThreadContextClassLoader. We register *j2* next, the new URLClassLoader > created will be *u2* with *u1* as parent and *u2* becomes the new > ThreadContextClassLoader. Note *u2* includes paths to both jars *j1* and *j2* > whereas *u1* only has paths to *j1* (For details see: > org.apache.hadoop.hive.ql.exec.Utilities#addToClassPath). > Now when we register class *c1* under a temporary function in Hive, we load > the class using {code} class.forName("c1", true, > Thread.currentThread().getContextClassLoader()) {code} . The > currentThreadContext class-loader is *u2*, and it has the path to the class > *c1*, but note that Class-loaders work by delegating to parent class-loader > first. In this case class *c1* will be found and *defined* by class-loader > *u1*. > Now *c1* from jar *j1* has *u1* as its class-loader. If a method (say > initialize) is called in *c1*, which references the class *c2*, *c2* will not > be found since the class-loader used to search for *c2* will be *u1* (Since > the caller's class-loader is used to load a class) > I've added a qtest to explain the problem. Please see the attached patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
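The parent-first delegation at the heart of this report can be demonstrated with two chained URLClassLoaders: a class loaded *through* a child loader is still *defined* by an ancestor, so references inside it later resolve against the ancestor's (smaller) classpath. The sketch below uses a JDK class instead of the jars j1/j2 from the description, but the delegation mechanics are the same.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Demonstrates parent-first delegation: Class.forName through the child u2
// returns a class defined by an ancestor loader, not by u2 itself.
public class DelegationDemo {
    public static boolean definedByAncestor() {
        try {
            URLClassLoader u1 = new URLClassLoader(new URL[0]);     // like the loader for j1
            URLClassLoader u2 = new URLClassLoader(new URL[0], u1); // child, like j1 + j2
            // Delegation walks the parent chain first, so ArrayList is
            // defined by the bootstrap loader, never by u2.
            Class<?> c = Class.forName("java.util.ArrayList", true, u2);
            return c.getClassLoader() != u2;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

In the report's terms: c1 gets defined by u1, so c1's references to c2 are searched only on u1's classpath, where j2 is missing.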
[jira] [Commented] (HIVE-11675) make use of file footer PPD API in ETL strategy or separate strategy
[ https://issues.apache.org/jira/browse/HIVE-11675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025704#comment-15025704 ] Prasanth Jayachandran commented on HIVE-11675: -- Left some comments in RB > make use of file footer PPD API in ETL strategy or separate strategy > > > Key: HIVE-11675 > URL: https://issues.apache.org/jira/browse/HIVE-11675 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-11675.01.patch, HIVE-11675.02.patch, > HIVE-11675.patch > > > Need to take a look at the best flow. It won't be much different if we do a > filtering metastore call for each partition. So perhaps we'd need the custom > sync point/batching after all. > Or we can make it opportunistic and not fetch any footers unless they can be > pushed down to the metastore or fetched from the local cache; that way the only slow > threaded op is directory listings. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12341) LLAP: add security to daemon protocol endpoint (excluding shuffle)
[ https://issues.apache.org/jira/browse/HIVE-12341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-12341: Attachment: HIVE-12341.04.patch Small fix to retry logic > LLAP: add security to daemon protocol endpoint (excluding shuffle) > -- > > Key: HIVE-12341 > URL: https://issues.apache.org/jira/browse/HIVE-12341 > Project: Hive > Issue Type: Bug >Reporter: Sergey Shelukhin >Assignee: Sergey Shelukhin > Attachments: HIVE-12341.01.patch, HIVE-12341.02.patch, > HIVE-12341.03.patch, HIVE-12341.03.patch, HIVE-12341.04.patch, > HIVE-12341.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HIVE-12366) Refactor Heartbeater logic for transaction
[ https://issues.apache.org/jira/browse/HIVE-12366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eugene Koifman reassigned HIVE-12366: - Assignee: Eugene Koifman (was: Elias Elmqvist Wulcan) > Refactor Heartbeater logic for transaction > -- > > Key: HIVE-12366 > URL: https://issues.apache.org/jira/browse/HIVE-12366 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Wei Zheng >Assignee: Eugene Koifman > Attachments: HIVE-12366.1.patch, HIVE-12366.2.patch, > HIVE-12366.3.patch, HIVE-12366.4.patch > > > Currently there is a gap between the time of lock acquisition and the first > heartbeat being sent out. Normally the gap is negligible, but when it's big > it will cause the query to fail since the locks have timed out by the time the > heartbeat is sent. > Need to remove this gap. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HIVE-12510) LLAP: Append attempt id either to thread name or NDC
[ https://issues.apache.org/jira/browse/HIVE-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran resolved HIVE-12510. -- Resolution: Implemented The fix for this is included in the .3 version of HIVE-12020 > LLAP: Append attempt id either to thread name or NDC > > > Key: HIVE-12510 > URL: https://issues.apache.org/jira/browse/HIVE-12510 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > Currently, in LLAP attempt id gets appended to both thread name and added to > NDC creating long log lines like below > {code} > [TezTaskRunner_attempt_1448393634013_0008_1_03_00_0[attempt_1448393634013_0008_1_03_00_0]] > {code} > I think it will be sufficient to add only to NDC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (HIVE-12483) Fix precommit Spark test branch
[ https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergio Peña updated HIVE-12483: --- Comment: was deleted (was: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12773549/HIVE-12483.1-spark.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 9788 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_annotate_stats_groupby org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1012/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1012/console Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-1012/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 10 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12773549 - PreCommit-HIVE-SPARK-Build) > Fix precommit Spark test branch > --- > > Key: HIVE-12483 > URL: https://issues.apache.org/jira/browse/HIVE-12483 > Project: Hive > Issue Type: Task >Reporter: Sergio Peña >Assignee: Sergio Peña > Attachments: HIVE-12483.1-spark.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12510) LLAP: Append attempt id either to thread name or NDC
[ https://issues.apache.org/jira/browse/HIVE-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025362#comment-15025362 ] Sergey Shelukhin commented on HIVE-12510: - IIRC this is Tez naming convention from way before the NDC > LLAP: Append attempt id either to thread name or NDC > > > Key: HIVE-12510 > URL: https://issues.apache.org/jira/browse/HIVE-12510 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > Currently, in LLAP attempt id gets appended to both thread name and added to > NDC creating long log lines like below > {code} > [TezTaskRunner_attempt_1448393634013_0008_1_03_00_0[attempt_1448393634013_0008_1_03_00_0]] > {code} > I think it will be sufficient to add only to NDC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12008) Hive queries failing when using count(*) on column in view
[ https://issues.apache.org/jira/browse/HIVE-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-12008: Attachment: HIVE-12008.5.patch > Hive queries failing when using count(*) on column in view > -- > > Key: HIVE-12008 > URL: https://issues.apache.org/jira/browse/HIVE-12008 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-12008.1.patch, HIVE-12008.2.patch, > HIVE-12008.3.patch, HIVE-12008.4.patch, HIVE-12008.5.patch > > > count(*) on view with get_json_object() UDF and lateral views and unions > fails in the master with error: > 2015-10-27 17:51:33,742 WARN [main] org.apache.hadoop.mapred.YarnChild: > Exception running child : java.lang.RuntimeException: Error in configuring > object > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) > at > org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) > at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) > ... 
9 more > Caused by: java.lang.RuntimeException: Error in configuring object > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) > at > org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) > at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) > at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) > ... 14 more > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) > ... 17 more > Caused by: java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:147) > ... 22 more > Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 > at java.util.ArrayList.rangeCheck(ArrayList.java:635) > at java.util.ArrayList.get(ArrayList.java:411) > This query works fine in version 1.1. > The last two qfile unit tests added by HIVE-11384 fail when hive.in.test is > false. It may relate to how we handle the prunelist for select. When select includes > every column in a table, the prunelist for the select is empty. It may cause > issues when calculating its parent's prunelist. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12508) HiveAggregateJoinTransposeRule places a heavy load on the metadata system
[ https://issues.apache.org/jira/browse/HIVE-12508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025489#comment-15025489 ] Laljo John Pullokkaran commented on HIVE-12508: --- [~jcamachorodriguez] Given that HIVE-12503 fixes this, we shouldn't be running into this issue any more. > HiveAggregateJoinTransposeRule places a heavy load on the metadata system > - > > Key: HIVE-12508 > URL: https://issues.apache.org/jira/browse/HIVE-12508 > Project: Hive > Issue Type: Bug > Components: CBO >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12508.patch > > > Finding out whether the input is already unique requires a call to > areColumnsUnique that currently (until CALCITE-794 is fixed) places a heavy > load on the metadata system. This can lead to long CBO planning. > This is a temporary fix that avoids the call to the method till then. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12466) SparkCounter not initialized error
[ https://issues.apache.org/jira/browse/HIVE-12466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025212#comment-15025212 ] Xuefu Zhang commented on HIVE-12466: Thanks to Rui/Chengxiang for working on this. I happened to see that counter-based stats gathering is completely removed by HIVE-12411. I'd like to know its implications. Does it mean that we don't even need SparkCounter at all? Are there any impacts on Spark regarding stats collection with the removal? Thanks. > SparkCounter not initialized error > -- > > Key: HIVE-12466 > URL: https://issues.apache.org/jira/browse/HIVE-12466 > Project: Hive > Issue Type: Bug > Components: Spark >Reporter: Rui Li >Assignee: Rui Li > Attachments: HIVE-12466.1-spark.patch > > > During a query, lots of the following errors are found in the executor's log: > {noformat} > 03:47:28.759 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] > has not initialized before. > 03:47:28.762 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, RECORDS_OUT_0] > has not initialized before. > 03:47:30.707 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.tmp_tmp] has not initialized before. > 03:47:33.385 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:33.388 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > 03:47:33.495 [Executor task launch worker-0] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before.
> 03:47:35.141 [Executor task launch worker-1] ERROR > org.apache.hive.spark.counter.SparkCounters - counter[HIVE, > RECORDS_OUT_1_default.test_table] has not initialized before. > ... > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
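The error above suggests a register-before-increment contract: a counter must be created up front, and incrementing an unregistered name is reported rather than silently creating it. A minimal sketch of that contract, with illustrative names (not the SparkCounters API):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical counter group: increments on unregistered counters are
// counted as errors, mirroring the "has not initialized before" log line.
public class CounterGroup {
    private final Map<String, Long> counters = new HashMap<>();
    private int errors = 0;

    public void register(String name) {
        counters.putIfAbsent(name, 0L);
    }

    public void increment(String name, long delta) {
        Long value = counters.get(name);
        if (value == null) {
            errors++;        // counter was never registered
            return;
        }
        counters.put(name, value + delta);
    }

    public long value(String name) {
        return counters.getOrDefault(name, -1L);
    }

    public int errorCount() {
        return errors;
    }
}
```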
[jira] [Commented] (HIVE-11977) Hive should handle an external avro table with zero length files present
[ https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025457#comment-15025457 ] Brock Noland commented on HIVE-11977: - [~dossett] Sorry, I just saw this ping! I moved my mail account and had not yet configured my rules appropriately. This patch looks good! Nice work [~sershe] - agreed, it'd be great to see this in 1.x. > Hive should handle an external avro table with zero length files present > > > Key: HIVE-11977 > URL: https://issues.apache.org/jira/browse/HIVE-11977 > Project: Hive > Issue Type: Bug > Components: Serializers/Deserializers >Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1 >Reporter: Aaron Dossett >Assignee: Aaron Dossett > Fix For: 2.0.0 > > Attachments: HIVE-11977.2.patch, HIVE-11977.patch > > > If a zero length file is in the top level directory housing an external avro > table, all hive queries on the table fail. > The issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader > creates a new org.apache.avro.file.DataFileReader and DataFileReader throws > an exception when trying to read an empty file (because the empty file lacks > the magic number marking it as avro). > AvroGenericRecordReader should detect an empty file and then behave > reasonably. > Caused by: java.io.IOException: Not a data file. > at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102) > at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97) > at > org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.<init>(AvroGenericRecordReader.java:81) > at > org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51) > at > org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246) > ... 25 more -- This message was sent by Atlassian JIRA (v6.3.4#6332)
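The guard the description calls for is simple: an Avro container file begins with a 4-byte magic ("Obj" plus a version byte), so a zero-length file cannot possibly be valid Avro and should be treated as containing no records instead of being handed to DataFileReader. A minimal length-based sketch of that check (the real patch works inside AvroGenericRecordReader, not a standalone class):

```java
// Hypothetical pre-check: skip files too short to carry the Avro magic,
// avoiding the "Not a data file." IOException from DataFileReader.
public class AvroEmptyFileCheck {
    // Avro object container files start with the 4-byte magic: 'O','b','j',1.
    static final long MAGIC_LENGTH = 4;

    public static boolean canBeAvro(long fileLength) {
        return fileLength >= MAGIC_LENGTH;
    }
}
```

A caller would check canBeAvro(fileStatus.getLen()) before constructing the reader and return an empty record iterator otherwise.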
[jira] [Updated] (HIVE-11488) Add sessionId and queryId info to HS2 log
[ https://issues.apache.org/jira/browse/HIVE-11488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aihua Xu updated HIVE-11488: Fix Version/s: 2.0.0 > Add sessionId and queryId info to HS2 log > - > > Key: HIVE-11488 > URL: https://issues.apache.org/jira/browse/HIVE-11488 > Project: Hive > Issue Type: New Feature > Components: Logging >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-11488.2.patch, HIVE-11488.3.patch, HIVE-11488.patch > > > Session is critical for a multi-user system like Hive. Currently Hive doesn't > log sessionId to the log file, which sometimes makes debugging and analysis > difficult when multiple activities are going on at the same time and the log > from different sessions are mixed together. > Currently, Hive already has the sessionId saved in SessionState and also > there is another sessionId in SessionHandle (Seems not used and I'm still > looking to understand it). Generally we should have one sessionId from the > beginning in the client side and server side. Seems we have some work to do on that > side first. > The sessionId then can be added to the log4j supported mapped diagnostic context > (MDC) and can be configured to output to the log file through the log4j property. > MDC is per thread, so we need to add sessionId to the HS2 main thread and > then it will be inherited by the child threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
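The inheritance property the description relies on (set the sessionId on the HS2 main thread, have it visible in child threads) can be sketched with an InheritableThreadLocal, which is how per-thread MDC values propagate to spawned threads. This is a standalone illustration of that propagation; the actual patch uses the logging framework's MDC, not a class like this.

```java
// Sketch: a session id set on a parent thread is inherited by child threads,
// the same propagation an MDC needs so every log line can carry the id.
public class SessionIdContext {
    private static final InheritableThreadLocal<String> SESSION_ID =
            new InheritableThreadLocal<>();

    public static void set(String id) {
        SESSION_ID.set(id);
    }

    public static String get() {
        return SESSION_ID.get();
    }

    // Reads the value from a freshly spawned thread to show inheritance.
    public static String readFromChildThread() {
        final String[] seen = new String[1];
        Thread child = new Thread(() -> seen[0] = SESSION_ID.get());
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }
}
```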
[jira] [Commented] (HIVE-12510) LLAP: Append attempt id either to thread name or NDC
[ https://issues.apache.org/jira/browse/HIVE-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025339#comment-15025339 ] Prasanth Jayachandran commented on HIVE-12510: -- [~sseth]/[~sershe] any reason for attempt id to be added in both places? > LLAP: Append attempt id either to thread name or NDC > > > Key: HIVE-12510 > URL: https://issues.apache.org/jira/browse/HIVE-12510 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > > Currently, in LLAP attempt id gets appended to both thread name and added to > NDC creating long log lines like below > {code} > [TezTaskRunner_attempt_1448393634013_0008_1_03_00_0[attempt_1448393634013_0008_1_03_00_0]] > {code} > I think it will be sufficient to add only to NDC. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12500) JDBC driver not overlaying params supplied via properties object when reading params from ZK
[ https://issues.apache.org/jira/browse/HIVE-12500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vaibhav Gumashta updated HIVE-12500: Summary: JDBC driver not overlaying params supplied via properties object when reading params from ZK (was: JDBC driver not be overlaying params supplied via properties object when reading params from ZK) > JDBC driver not overlaying params supplied via properties object when reading > params from ZK > > > Key: HIVE-12500 > URL: https://issues.apache.org/jira/browse/HIVE-12500 > Project: Hive > Issue Type: Bug > Components: JDBC >Affects Versions: 1.3.0, 2.0.0 >Reporter: Vaibhav Gumashta >Assignee: Vaibhav Gumashta > Attachments: HIVE-12500.1.patch > > > It makes sense to setup the connection info in one place. Right now part of > connection configuration happens in Utils#parseURL and part in the > HiveConnection constructor. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12307) Streaming API TransactionBatch.close() must abort any remaining transactions in the batch
[ https://issues.apache.org/jira/browse/HIVE-12307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025266#comment-15025266 ] Eugene Koifman commented on HIVE-12307: --- bq. I'm +1 on making this package level, but does it do any good to make the class non-private and leave the constructor private? The class is made package level for testing only. The private c'tor ensures that it's only constructed via factory methods, as originally implemented. bq. Why did you make the isClosed value volatile? Heartbeating is commonly done from a separate thread; for example, Storm does it this way. Also, it's not unusual for application clean up logic to come from a different thread (for example calling close() as a form of cancel). So this is volatile to make sure this works properly regardless of how the client is implemented. I didn't try to address more general thread safety issues in this patch. Judging by https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest#StreamingDataIngest-Example–Non-secureMode the original intent was NOT to have multiple threads in a StreamingConnection. It's worthwhile to do a thread safety review but that was not my goal here. bq. write() I'll refactor this bq. SerializationError This was meant to indicate that a particular row is bad. For example missing columns, etc. This gives the client the ability to drop this row (or send it to a dead letter queue) since replaying it won't help. Unfortunately, w/o my changes here the client never sees SerializationError - it gets wrapped in other exceptions. bq.
abortImpl() there is https://issues.apache.org/jira/browse/HIVE-12440 for that > Streaming API TransactionBatch.close() must abort any remaining transactions > in the batch > - > > Key: HIVE-12307 > URL: https://issues.apache.org/jira/browse/HIVE-12307 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 0.14.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-12307.patch > > > When the client of the TransactionBatch API encounters an error it must > close() the batch and start a new one. This prevents attempts to continue > writing to a file that may be damaged in some way. > The close() should abort any txns that still remain in the batch and close > (best effort) all the files it's writing to. The batch should also put itself > into a mode where any future ops on this batch fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
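The volatile close flag and the private-constructor/factory pattern discussed in the comment above can be sketched roughly as follows. This is an illustrative stand-in, not Hive's actual TransactionBatch code; the class and method names are made up:

```java
// Sketch: a private constructor forces construction through the factory
// method, and a volatile closed flag makes a close() issued from a separate
// heartbeat/cleanup thread visible to the writer thread without locking.
public class Batch {
    private volatile boolean closed = false;

    private Batch() { }                      // only the factory may construct

    public static Batch open() {             // factory method, as in the API
        return new Batch();
    }

    public void close() {                    // may be called as a form of cancel
        closed = true;
    }

    public void write(String row) {
        if (closed) {
            // any future ops on a closed batch fail, per the issue description
            throw new IllegalStateException("batch is closed");
        }
        // ... append row to the underlying file ...
    }

    public boolean isClosed() {
        return closed;
    }

    public static void main(String[] args) {
        Batch b = Batch.open();
        b.close();                           // e.g. from a cleanup thread
        System.out.println(b.isClosed());    // prints "true"
    }
}
```

Without `volatile`, a writer thread could keep seeing a stale `closed == false` after another thread's `close()`, which is the scenario the comment is guarding against.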
[jira] [Commented] (HIVE-11488) Add sessionId and queryId info to HS2 log
[ https://issues.apache.org/jira/browse/HIVE-11488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025393#comment-15025393 ] Aihua Xu commented on HIVE-11488: - I'm wondering who needs to edit the doc. I tried to edit it, but it seems I don't have permission to edit the page. > Add sessionId and queryId info to HS2 log > - > > Key: HIVE-11488 > URL: https://issues.apache.org/jira/browse/HIVE-11488 > Project: Hive > Issue Type: New Feature > Components: Logging >Affects Versions: 2.0.0 >Reporter: Aihua Xu >Assignee: Aihua Xu > Labels: TODOC2.0 > Fix For: 2.0.0 > > Attachments: HIVE-11488.2.patch, HIVE-11488.3.patch, HIVE-11488.patch > > > Session is critical for a multi-user system like Hive. Currently Hive doesn't > log the sessionId to the log file, which sometimes makes debugging and > analysis difficult when multiple activities are going on at the same time and > the logs from different sessions are mixed together. > Currently, Hive already has the sessionId saved in SessionState, and there is > also another sessionId in SessionHandle (seemingly unused; I'm still looking > to understand it). Generally we should have one sessionId from the beginning > on both the client side and the server side. It seems we need to do some work > on that side first. > The sessionId can then be added to log4j's mapped diagnostic context (MDC) > and can be configured to output to the log file through the log4j property. > MDC is per thread, so we need to add the sessionId to the HS2 main thread and > then it will be inherited by the child threads. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
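The per-thread inheritance the description relies on can be illustrated without log4j: log4j's MDC is backed by an inheritable per-thread map, so a value set once in the parent (HS2 main) thread is copied into child threads when they are constructed. The sketch below mimics that mechanism with a plain `InheritableThreadLocal`; the key value "session-1234" is purely illustrative:

```java
import java.util.concurrent.atomic.AtomicReference;

public class MdcSketch {
    // Stand-in for log4j's MDC storage: an InheritableThreadLocal, so a
    // child thread copies the parent's value at construction time.
    static final InheritableThreadLocal<String> SESSION_ID = new InheritableThreadLocal<>();

    static String childSees() {
        SESSION_ID.set("session-1234");      // set once in the parent thread
        AtomicReference<String> seen = new AtomicReference<>();
        // The child thread inherits the value here, when it is constructed.
        Thread worker = new Thread(() -> seen.set(SESSION_ID.get()));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return seen.get();
    }

    public static void main(String[] args) {
        System.out.println(childSees());     // prints "session-1234"
    }
}
```

This is why setting the sessionId in the HS2 main thread is sufficient for threads spawned afterwards; threads created before the value is set, or pooled threads, would not pick it up.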
[jira] [Commented] (HIVE-9599) remove derby, datanucleus and other not related to jdbc client classes from hive-jdbc-standalone.jar
[ https://issues.apache.org/jira/browse/HIVE-9599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025447#comment-15025447 ] Hive QA commented on HIVE-9599: --- Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12773859/HIVE-9599.3.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 9827 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniLlapCliDriver - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6117/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6117/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6117/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12773859 - PreCommit-HIVE-TRUNK-Build > remove derby, datanucleus and other not related to jdbc client classes from > hive-jdbc-standalone.jar > > > Key: HIVE-9599 > URL: https://issues.apache.org/jira/browse/HIVE-9599 > Project: Hive > Issue Type: Improvement > Components: JDBC >Reporter: Alexander Pivovarov >Assignee: Alexander Pivovarov >Priority: Minor > Attachments: HIVE-9599.1.patch, HIVE-9599.2.patch, HIVE-9599.3.patch, > HIVE-9599.3.patch > > > Looks like the following packages (included to hive-jdbc-standalone.jar) are > not used when jdbc client opens jdbc connection and runs queries: > {code} > antlr/ > antlr/actions/cpp/ > antlr/actions/csharp/ > antlr/actions/java/ > antlr/actions/python/ > antlr/ASdebug/ > antlr/build/ > antlr/collections/ > antlr/collections/impl/ > antlr/debug/ > antlr/debug/misc/ > antlr/preprocessor/ > com/google/gson/ > com/google/gson/annotations/ > com/google/gson/internal/ > com/google/gson/internal/bind/ > com/google/gson/reflect/ > com/google/gson/stream/ > com/google/inject/ > com/google/inject/binder/ > com/google/inject/internal/ > com/google/inject/internal/asm/ > com/google/inject/internal/cglib/core/ > com/google/inject/internal/cglib/proxy/ > com/google/inject/internal/cglib/reflect/ > com/google/inject/internal/util/ > com/google/inject/matcher/ > com/google/inject/name/ > com/google/inject/servlet/ > com/google/inject/spi/ > com/google/inject/util/ > com/jamesmurty/utils/ > com/jcraft/jsch/ > com/jcraft/jsch/jce/ > com/jcraft/jsch/jcraft/ > com/jcraft/jsch/jgss/ > com/jolbox/bonecp/ > com/jolbox/bonecp/hooks/ > com/jolbox/bonecp/proxy/ > com/sun/activation/registries/ > com/sun/activation/viewers/ > com/sun/istack/ > com/sun/istack/localization/ > com/sun/istack/logging/ > com/sun/mail/handlers/ > com/sun/mail/iap/ > com/sun/mail/imap/ > com/sun/mail/imap/protocol/ > com/sun/mail/mbox/ > com/sun/mail/pop3/ > com/sun/mail/smtp/ > com/sun/mail/util/ > com/sun/xml/bind/ > com/sun/xml/bind/annotation/ > 
com/sun/xml/bind/api/ > com/sun/xml/bind/api/impl/ > com/sun/xml/bind/marshaller/ > com/sun/xml/bind/unmarshaller/ > com/sun/xml/bind/util/ > com/sun/xml/bind/v2/ > com/sun/xml/bind/v2/bytecode/ > com/sun/xml/bind/v2/model/annotation/ > com/sun/xml/bind/v2/model/core/ > com/sun/xml/bind/v2/model/impl/ > com/sun/xml/bind/v2/model/nav/ > com/sun/xml/bind/v2/model/runtime/ > com/sun/xml/bind/v2/runtime/ > com/sun/xml/bind/v2/runtime/output/ > com/sun/xml/bind/v2/runtime/property/ > com/sun/xml/bind/v2/runtime/reflect/ > com/sun/xml/bind/v2/runtime/reflect/opt/ > com/sun/xml/bind/v2/runtime/unmarshaller/ > com/sun/xml/bind/v2/schemagen/ > com/sun/xml/bind/v2/schemagen/episode/ > com/sun/xml/bind/v2/schemagen/xmlschema/ > com/sun/xml/bind/v2/util/ > com/sun/xml/txw2/ > com/sun/xml/txw2/annotation/ > com/sun/xml/txw2/output/ > com/thoughtworks/paranamer/ > contribs/mx/ > javax/activation/ > javax/annotation/ > javax/annotation/concurrent/ > javax/annotation/meta/ > javax/annotation/security/ > javax/el/ > javax/inject/ > javax/jdo/ > javax/jdo/annotations/ > javax/jdo/datastore/ > javax/jdo/identity/ > javax/jdo/listener/ > javax/jdo/metadata/ > javax/jdo/spi/ > javax/mail/ > javax/mail/event/ > javax/mail/internet/ >
[jira] [Commented] (HIVE-12338) Add webui to HiveServer2
[ https://issues.apache.org/jira/browse/HIVE-12338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15026337#comment-15026337 ] Hive QA commented on HIVE-12338: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12774132/HIVE-12338.3.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 44 failed/errored test(s), 9822 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniLlapCliDriver - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_insert_values_nonascii org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_limit_join_transpose org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_fetchwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_nonvec_mapwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_fetchwork_table org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_schema_evol_text_mapwork_table org.apache.hadoop.hive.cli.TestHBaseCliDriver.testCliDriver_hbase_queries org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_part org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acid_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_part 
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_acidvec_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_nonvec_fetchwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_nonvec_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_orc_vec_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_fetchwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_schema_evol_text_mapwork_table org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3 org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hive.jdbc.TestJdbcWithMiniHS2.testConcurrentStatements org.apache.hive.jdbc.TestJdbcWithMiniHS2.testHttpHeaderSize org.apache.hive.jdbc.TestJdbcWithMiniHS2.testRootScratchDir org.apache.hive.jdbc.TestJdbcWithMiniHS2.testUdfBlackList org.apache.hive.jdbc.TestJdbcWithMiniHS2.testUdfBlackListOverride org.apache.hive.jdbc.TestJdbcWithMiniHS2.testUdfWhiteList org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark org.apache.hive.jdbc.TestNoSaslAuth.org.apache.hive.jdbc.TestNoSaslAuth org.apache.hive.jdbc.TestSSL.testSSLVersion org.apache.hive.jdbc.TestSchedulerQueue.testFairSchedulerPrimaryQueueMapping org.apache.hive.jdbc.TestSchedulerQueue.testFairSchedulerQueueMapping org.apache.hive.jdbc.TestSchedulerQueue.testFairSchedulerSecondaryQueueMapping org.apache.hive.jdbc.TestSchedulerQueue.testQueueMappingCheckDisabled org.apache.hive.jdbc.authorization.TestHS2AuthzContext.org.apache.hive.jdbc.authorization.TestHS2AuthzContext org.apache.hive.jdbc.authorization.TestHS2AuthzSessionContext.org.apache.hive.jdbc.authorization.TestHS2AuthzSessionContext org.apache.hive.jdbc.authorization.TestJdbcWithSQLAuthUDFBlacklist.testBlackListedUdfUsage 
org.apache.hive.jdbc.miniHS2.TestHiveServer2.org.apache.hive.jdbc.miniHS2.TestHiveServer2 org.apache.hive.jdbc.miniHS2.TestHs2Metrics.testMetrics org.apache.hive.minikdc.TestHs2HooksWithMiniKdc.org.apache.hive.minikdc.TestHs2HooksWithMiniKdc {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6121/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6121/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6121/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 44 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12774132 - PreCommit-HIVE-TRUNK-Build > Add
[jira] [Commented] (HIVE-12307) Streaming API TransactionBatch.close() must abort any remaining transactions in the batch
[ https://issues.apache.org/jira/browse/HIVE-12307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024902#comment-15024902 ] Eugene Koifman commented on HIVE-12307: --- [~alangates] could you review, please? > Streaming API TransactionBatch.close() must abort any remaining transactions > in the batch > - > > Key: HIVE-12307 > URL: https://issues.apache.org/jira/browse/HIVE-12307 > Project: Hive > Issue Type: Bug > Components: HCatalog, Transactions >Affects Versions: 0.14.0 >Reporter: Eugene Koifman >Assignee: Eugene Koifman > Attachments: HIVE-12307.patch > > > When the client of the TransactionBatch API encounters an error it must > close() the batch and start a new one. This prevents attempts to continue > writing to a file that may be damaged in some way. > The close() should abort any txns that still remain in the batch and close > (best effort) all the files it's writing to. The batch should also put itself > into a mode where any future ops on this batch fail. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-12182) ALTER TABLE PARTITION COLUMN does not set partition column comments
[ https://issues.apache.org/jira/browse/HIVE-12182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Naveen Gangam updated HIVE-12182: - Attachment: HIVE-12182.3.patch Rebasing the last patch. > ALTER TABLE PARTITION COLUMN does not set partition column comments > --- > > Key: HIVE-12182 > URL: https://issues.apache.org/jira/browse/HIVE-12182 > Project: Hive > Issue Type: Bug > Components: SQL >Affects Versions: 1.2.1 >Reporter: Lenni Kuff >Assignee: Naveen Gangam > Attachments: HIVE-12182.2.patch, HIVE-12182.3.patch, HIVE-12182.patch > > > ALTER TABLE PARTITION COLUMN does not set partition column comments. The > syntax is accepted, but the COMMENT for the column is ignored. > {code} > 0: jdbc:hive2://localhost:1/default> create table part_test(i int comment > 'HELLO') partitioned by (j int comment 'WORLD'); > No rows affected (0.104 seconds) > 0: jdbc:hive2://localhost:1/default> describe part_test; > +--+---+---+--+ > | col_name | data_type |comment| > +--+---+---+--+ > | i| int | HELLO | > | j| int | WORLD | > | | NULL | NULL | > | # Partition Information | NULL | NULL | > | # col_name | data_type | comment | > | | NULL | NULL | > | j| int | WORLD | > +--+---+---+--+ > 7 rows selected (0.109 seconds) > 0: jdbc:hive2://localhost:1/default> alter table part_test partition > column (j int comment 'WIDE'); > No rows affected (0.121 seconds) > 0: jdbc:hive2://localhost:1/default> describe part_test; > +--+---+---+--+ > | col_name | data_type |comment| > +--+---+---+--+ > | i| int | HELLO | > | j| int | | > | | NULL | NULL | > | # Partition Information | NULL | NULL | > | # col_name | data_type | comment | > | | NULL | NULL | > | j| int | | > +--+---+---+--+ > 7 rows selected (0.108 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12008) Hive queries failing when using count(*) on column in view
[ https://issues.apache.org/jira/browse/HIVE-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025019#comment-15025019 ] Hive QA commented on HIVE-12008: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12773842/HIVE-12008.4.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 17 failed/errored test(s), 9827 tests executed *Failed tests:* {noformat} TestHWISessionManager - did not produce a TEST-*.xml file TestMiniLlapCliDriver - did not produce a TEST-*.xml file org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_subquery_notin_having org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_dynpart_hashjoin_3 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_tez_smb_empty org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union2 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vector_null_projection org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union16 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union2 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union9 org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_view org.apache.hive.jdbc.TestMultiSessionsHS2WithLocalClusterSpark.testSparkQuery org.apache.hive.jdbc.TestSSL.testSSLVersion {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6116/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/6116/console Test logs: 
http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-6116/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 17 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12773842 - PreCommit-HIVE-TRUNK-Build > Hive queries failing when using count(*) on column in view > -- > > Key: HIVE-12008 > URL: https://issues.apache.org/jira/browse/HIVE-12008 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-12008.1.patch, HIVE-12008.2.patch, > HIVE-12008.3.patch, HIVE-12008.4.patch > > > count(*) on view with get_json_object() UDF and lateral views and unions > fails in the master with error: > 2015-10-27 17:51:33,742 WARN [main] org.apache.hadoop.mapred.YarnChild: > Exception running child : java.lang.RuntimeException: Error in configuring > object > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) > at > org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) > at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) > ... 9 more > Caused by: java.lang.RuntimeException: Error in configuring object > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) > at > org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) > at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) > at
[jira] [Commented] (HIVE-12008) Hive queries failing when using count(*) on column in view
[ https://issues.apache.org/jira/browse/HIVE-12008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025057#comment-15025057 ] Yongzhi Chen commented on HIVE-12008: - Need to add the fixes for the tez and spark results too. > Hive queries failing when using count(*) on column in view > -- > > Key: HIVE-12008 > URL: https://issues.apache.org/jira/browse/HIVE-12008 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Yongzhi Chen >Assignee: Yongzhi Chen > Attachments: HIVE-12008.1.patch, HIVE-12008.2.patch, > HIVE-12008.3.patch, HIVE-12008.4.patch > > > count(*) on view with get_json_object() UDF and lateral views and unions > fails in the master with error: > 2015-10-27 17:51:33,742 WARN [main] org.apache.hadoop.mapred.YarnChild: > Exception running child : java.lang.RuntimeException: Error in configuring > object > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) > at > org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) > at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) > at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:449) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > 
org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) > ... 9 more > Caused by: java.lang.RuntimeException: Error in configuring object > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:109) > at > org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:75) > at > org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133) > at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38) > ... 14 more > Caused by: java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106) > ... 17 more > Caused by: java.lang.RuntimeException: Map operator initialization failed > at > org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(ExecMapper.java:147) > ... 22 more > Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1 > at java.util.ArrayList.rangeCheck(ArrayList.java:635) > at java.util.ArrayList.get(ArrayList.java:411) > This query works fine in version 1.1. > The last two qfile unit tests added by HIVE-11384 fail when hive.in.test is > false. It may relate to how we handle the prunelist for select. When a select > includes every column in a table, the prunelist for the select is empty, > which may cause issues when calculating its parent's prunelist. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12175) Upgrade Kryo version to 3.0.x
[ https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025072#comment-15025072 ] Prasanth Jayachandran commented on HIVE-12175: -- This patch is for the master branch only. For branch 1.2.1 you can remove the lines related to registration of the StandardConstant* classes. I would recommend fixing the issue separately for branch-1.2.1 instead of upgrading the kryo version. I will put up another patch for branch-1 as soon as possible. > Upgrade Kryo version to 3.0.x > - > > Key: HIVE-12175 > URL: https://issues.apache.org/jira/browse/HIVE-12175 > Project: Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Affects Versions: 2.0.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, > HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, > HIVE-12175.5.patch, HIVE-12175.6.patch > > > The current version of kryo (2.22) has an issue (see the exception below and > in HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). > We need to either replace all occurrences of Arrays.asList() or change the > current StdInstantiatorStrategy. This issue is fixed in later versions, and > the kryo community recommends using DefaultInstantiatorStrategy with fallback > to StdInstantiatorStrategy. More discussion about this issue is here: > https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, a custom > serialization/deserialization class can be provided for Arrays.asList. > Also, kryo 3.0 introduced unsafe-based serialization, which claims to have > much better performance for certain types of serialization. 
> Exception: > {code} > Caused by: java.lang.NullPointerException > at java.util.Arrays$ArrayList.size(Arrays.java:2847) > at java.util.AbstractList.add(AbstractList.java:108) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18) > at > org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694) > at > org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106) > ... 57 more > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
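The instantiator-strategy setup the description recommends looks roughly like the following in Kryo 3.x. This is a sketch assuming kryo 3.0.x and its Objenesis dependency on the classpath, not the exact Hive patch:

```java
import com.esotericsoftware.kryo.Kryo;
import org.objenesis.strategy.StdInstantiatorStrategy;

public class KryoSetup {
    public static Kryo newKryo() {
        Kryo kryo = new Kryo();
        // Try a class's no-arg constructor first; fall back to Objenesis-based
        // instantiation for classes without one, such as the private list type
        // returned by Arrays.asList(), which otherwise trips the NPE below.
        kryo.setInstantiatorStrategy(
            new Kryo.DefaultInstantiatorStrategy(new StdInstantiatorStrategy()));
        return kryo;
    }
}
```

With the pure StdInstantiatorStrategy, `Arrays$ArrayList` is created without running any constructor, leaving its backing array null, which matches the NullPointerException in `Arrays$ArrayList.size` shown in the stack trace.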
[jira] [Commented] (HIVE-12509) Regenerate q files after HIVE-12017 went in
[ https://issues.apache.org/jira/browse/HIVE-12509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024746#comment-15024746 ] Ashutosh Chauhan commented on HIVE-12509: - +1 > Regenerate q files after HIVE-12017 went in > --- > > Key: HIVE-12509 > URL: https://issues.apache.org/jira/browse/HIVE-12509 > Project: Hive > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez > Attachments: HIVE-12509.patch > > > A few q files need to be updated, as they were not updated when HIVE-12017 > went in. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12484) Show meta operations on HS2 web UI
[ https://issues.apache.org/jira/browse/HIVE-12484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024797#comment-15024797 ] Jimmy Xiang commented on HIVE-12484: Metrics is a good option too. These meta operations should not take too much time, compared to SQL queries. If some operation could take a long time, it is a good candidate to put on the web UI. Right, the priority is lower than the SQL statements. > Show meta operations on HS2 web UI > -- > > Key: HIVE-12484 > URL: https://issues.apache.org/jira/browse/HIVE-12484 > Project: Hive > Issue Type: Sub-task > Components: HiveServer2 >Reporter: Jimmy Xiang > > As Mohit pointed out in the review of HIVE-12338, it is nice to show meta > operations on the HS2 web UI too, so that we can have an end-to-end picture > for those operations that access HMS. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-12465) Hive might produce wrong results when (outer) joins are merged
[ https://issues.apache.org/jira/browse/HIVE-12465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15024738#comment-15024738 ] Jesus Camacho Rodriguez commented on HIVE-12465: Sure, I will, and I'll update the issue. > Hive might produce wrong results when (outer) joins are merged > -- > > Key: HIVE-12465 > URL: https://issues.apache.org/jira/browse/HIVE-12465 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 2.0.0 >Reporter: Jesus Camacho Rodriguez >Assignee: Jesus Camacho Rodriguez >Priority: Blocker > Attachments: HIVE-12465.01.patch, HIVE-12465.02.patch, > HIVE-12465.patch > > > Consider the following query: > {noformat} > select * from > (select * from tab where tab.key = 0)a > full outer join > (select * from tab_part where tab_part.key = 98)b > join > tab_part c > on a.key = b.key and b.key = c.key; > {noformat} > Hive should execute the full outer join operation (without an ON clause) and > then the join operation (ON a.key = b.key and b.key = c.key). Instead, it > merges both joins, generating the following plan: > {noformat} > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Map Reduce > Map Operator Tree: > TableScan > alias: tab > filterExpr: (key = 0) (type: boolean) > Statistics: Num rows: 242 Data size: 22748 Basic stats: COMPLETE > Column stats: NONE > Filter Operator > predicate: (key = 0) (type: boolean) > Statistics: Num rows: 121 Data size: 11374 Basic stats: > COMPLETE Column stats: NONE > Select Operator > expressions: 0 (type: int), value (type: string), ds (type: > string) > outputColumnNames: _col0, _col1, _col2 > Statistics: Num rows: 121 Data size: 11374 Basic stats: > COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 121 Data size: 11374 Basic stats: > COMPLETE Column stats: NONE > value expressions: _col1 (type: 
string), _col2 (type: > string) > TableScan > alias: tab_part > filterExpr: (key = 98) (type: boolean) > Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE > Column stats: NONE > Filter Operator > predicate: (key = 98) (type: boolean) > Statistics: Num rows: 250 Data size: 23500 Basic stats: > COMPLETE Column stats: NONE > Select Operator > expressions: 98 (type: int), value (type: string), ds (type: > string) > outputColumnNames: _col0, _col1, _col2 > Statistics: Num rows: 250 Data size: 23500 Basic stats: > COMPLETE Column stats: NONE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 250 Data size: 23500 Basic stats: > COMPLETE Column stats: NONE > value expressions: _col1 (type: string), _col2 (type: > string) > TableScan > alias: c > Statistics: Num rows: 500 Data size: 47000 Basic stats: COMPLETE > Column stats: NONE > Reduce Output Operator > key expressions: key (type: int) > sort order: + > Map-reduce partition columns: key (type: int) > Statistics: Num rows: 500 Data size: 47000 Basic stats: > COMPLETE Column stats: NONE > value expressions: value (type: string), ds (type: string) > Reduce Operator Tree: > Join Operator > condition map: >Outer Join 0 to 1 >Inner Join 1 to 2 > keys: > 0 _col0 (type: int) > 1 _col0 (type: int) > 2 key (type: int) > outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, > _col7, _col8 > Statistics: Num rows: 1100 Data size: 103400 Basic stats: COMPLETE > Column stats: NONE > File Output Operator > compressed: false > Statistics: Num rows: 1100 Data size: 103400 Basic stats: > COMPLETE Column stats: NONE > table: > input format: org.apache.hadoop.mapred.TextInputFormat > output format: > org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat > serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
[jira] [Updated] (HIVE-12509) Regenerate q files after HIVE-12017 went in
[ https://issues.apache.org/jira/browse/HIVE-12509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12509:
-------------------------------------------

    Fix Version/s: 2.0.0

> Regenerate q files after HIVE-12017 went in
> -------------------------------------------
>
>                 Key: HIVE-12509
>                 URL: https://issues.apache.org/jira/browse/HIVE-12509
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>             Fix For: 2.0.0
>
>         Attachments: HIVE-12509.patch
>
>
> A few q files need to be updated, as they were not updated when HIVE-12017 went in.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HIVE-12509) Regenerate q files after HIVE-12017 went in
[ https://issues.apache.org/jira/browse/HIVE-12509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jesus Camacho Rodriguez updated HIVE-12509:
-------------------------------------------

    Attachment: HIVE-12509.patch

[~ashutoshc], could you +1? It is just the q file updates that I missed when I checked in HIVE-12017. Thanks

> Regenerate q files after HIVE-12017 went in
> -------------------------------------------
>
>                 Key: HIVE-12509
>                 URL: https://issues.apache.org/jira/browse/HIVE-12509
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>         Attachments: HIVE-12509.patch
>
>
> A few q files need to be updated, as they were not updated when HIVE-12017 went in.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (HIVE-12175) Upgrade Kryo version to 3.0.x
[ https://issues.apache.org/jira/browse/HIVE-12175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025108#comment-15025108 ] Ashutosh Chauhan commented on HIVE-12175:
-----------------------------------------

Forgot to ask: which classes need to be registered? If a user is adding a UDF with her own classes, will that work, given that her new classes are not going to be registered with the serializer?

> Upgrade Kryo version to 3.0.x
> -----------------------------
>
>                 Key: HIVE-12175
>                 URL: https://issues.apache.org/jira/browse/HIVE-12175
>             Project: Hive
>          Issue Type: Improvement
>          Components: Serializers/Deserializers
>    Affects Versions: 2.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>             Fix For: 2.0.0
>
>         Attachments: HIVE-12175.1.patch, HIVE-12175.2.patch, HIVE-12175.3.patch, HIVE-12175.3.patch, HIVE-12175.4.patch, HIVE-12175.5.patch, HIVE-12175.6.patch
>
>
> The current version of Kryo (2.22) has an issue (refer to the exception below and to HIVE-12174) with serializing ArrayLists generated using Arrays.asList(). We need to either replace all occurrences of Arrays.asList() or change the current StdInstantiatorStrategy. This issue is fixed in later versions, and the Kryo community recommends using DefaultInstantiatorStrategy with a fallback to StdInstantiatorStrategy. More discussion of this issue is here: https://github.com/EsotericSoftware/kryo/issues/216. Alternatively, a custom serialization/deserialization class can be provided for Arrays.asList.
> Also, Kryo 3.0 introduced unsafe-based serialization, which claims to have much better performance for certain types of serialization.
> Exception:
> {code}
> Caused by: java.lang.NullPointerException
>         at java.util.Arrays$ArrayList.size(Arrays.java:2847)
>         at java.util.AbstractList.add(AbstractList.java:108)
>         at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:112)
>         at org.apache.hive.com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
>         at org.apache.hive.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:694)
>         at org.apache.hive.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:106)
>         ... 57 more
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
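The root cause behind that stack trace is that Arrays.asList() does not return a java.util.ArrayList at all: it returns the private fixed-size view java.util.Arrays$ArrayList. When Kryo's instantiator strategy creates such an object without running its constructor, the backing array is null, so the first size()/add() call during CollectionSerializer.read blows up as shown above. A stdlib-only sketch of the pitfall and the replace-Arrays.asList workaround the issue mentions (the Kryo 3 configuration is shown only as a comment, since it needs the Kryo jar):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ArraysAsListPitfall {
    public static void main(String[] args) {
        List<String> fixed = Arrays.asList("a", "b");

        // Not a real ArrayList: it is the private, fixed-size Arrays$ArrayList view.
        System.out.println(fixed.getClass().getName());     // java.util.Arrays$ArrayList
        System.out.println(fixed instanceof ArrayList);     // false

        // Anything that instantiates the class reflectively and then calls add()
        // (as Kryo's CollectionSerializer does on deserialization) fails; even a
        // properly constructed instance rejects structural modification:
        boolean addFailed = false;
        try {
            fixed.add("c");
        } catch (UnsupportedOperationException e) {
            addFailed = true;                               // fixed-size view
        }
        System.out.println("add failed: " + addFailed);     // true

        // Workaround on the Hive side: copy into a real, growable ArrayList
        // before the object ever reaches the serializer.
        List<String> growable = new ArrayList<>(fixed);
        growable.add("c");
        System.out.println(growable);                       // [a, b, c]

        // The fix recommended by the Kryo community for 3.0.x (needs the Kryo
        // and Objenesis jars; shown here for reference only):
        //   kryo.setInstantiatorStrategy(new Kryo.DefaultInstantiatorStrategy(
        //       new org.objenesis.strategy.StdInstantiatorStrategy()));
    }
}
```

With that strategy, Kryo first tries the class's no-arg constructor (so collections come up correctly initialized) and only falls back to constructor-bypassing instantiation when no such constructor exists.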
[jira] [Commented] (HIVE-12399) Native Vector MapJoin can encounter "Null key not expected in MapJoin" and "Unexpected NULL in map join small table" exceptions
[ https://issues.apache.org/jira/browse/HIVE-12399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025125#comment-15025125 ] Sergey Shelukhin commented on HIVE-12399:
-----------------------------------------

+1

> Native Vector MapJoin can encounter "Null key not expected in MapJoin" and "Unexpected NULL in map join small table" exceptions
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-12399
>                 URL: https://issues.apache.org/jira/browse/HIVE-12399
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: HIVE-12399.01.patch, HIVE-12399.02.patch
>
>
> Instead of throwing an exception, just filter out the NULLs in the Native Vector MapJoin operators.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
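The idea behind the fix above can be sketched in a few lines. This is not Hive's vectorized code (class and method names here are hypothetical); it only illustrates why dropping NULL-keyed rows while loading the map-join hash table is safe: a NULL key can never satisfy an equi-join condition, so such rows contribute nothing to the matched output.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MapJoinNullFilter {
    // Toy small-table row: a join key (possibly null) and a value.
    record Row(Integer key, String value) {}

    // Instead of throwing "Unexpected NULL in map join small table", silently
    // skip NULL-keyed rows while building the hash table for the small side.
    static Map<Integer, String> loadSmallTable(List<Row> rows) {
        Map<Integer, String> hashTable = new LinkedHashMap<>();
        for (Row r : rows) {
            if (r.key() == null) continue;   // filter out NULLs, do not throw
            hashTable.put(r.key(), r.value());
        }
        return hashTable;
    }

    public static void main(String[] args) {
        List<Row> small = List.of(new Row(1, "x"), new Row(null, "y"), new Row(2, "z"));
        System.out.println(loadSmallTable(small));   // {1=x, 2=z}
    }
}
```

The same reasoning applies on the probe side for inner joins: a NULL big-table key cannot match any hash-table entry, so it can be filtered rather than treated as an error (outer-join variants still have to pass such rows through NULL-padded).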
[jira] [Commented] (HIVE-12483) Fix precommit Spark test branch
[ https://issues.apache.org/jira/browse/HIVE-12483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025143#comment-15025143 ] Hive QA commented on HIVE-12483:
--------------------------------

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12773549/HIVE-12483.1-spark.patch

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 9788 tests executed

*Failed tests:*
{noformat}
TestHWISessionManager - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join4
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_annotate_stats_groupby
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_constantPropagateForSubQuery
org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver_encryption_insert_partition_dynamic
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority2
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_authorization_uri_import
org.apache.hadoop.hive.metastore.txn.TestCompactionTxnHandler.testRevokeTimedOutWorkers
org.apache.hive.jdbc.TestSSL.testSSLVersion
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1012/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/1012/console
Test logs: http://ec2-50-18-27-0.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-1012/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12773549 - PreCommit-HIVE-SPARK-Build

> Fix precommit Spark test branch
> -------------------------------
>
>                 Key: HIVE-12483
>                 URL: https://issues.apache.org/jira/browse/HIVE-12483
>             Project: Hive
>          Issue Type: Task
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>         Attachments: HIVE-12483.1-spark.patch
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)