[jira] [Commented] (HIVE-2858) Cache remote map reduce job stack traces for additional logging
[ https://issues.apache.org/jira/browse/HIVE-2858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249218#comment-13249218 ]

Hudson commented on HIVE-2858:
--

Integrated in Hive-trunk-h0.21 #1360 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1360/])
HIVE-2858 Cache remote map reduce job stack traces for additional logging (Kevin Wilfong via namit) (Revision 1310583)

Result = ABORTED
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1310583
Files :
* /hive/trunk/build-common.xml
* /hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
* /hive/trunk/conf/hive-default.xml.template
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/JobDebugger.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/errors/TaskLogProcessor.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/hooks/VerifySessionStateStackTracesHook.java
* /hive/trunk/ql/src/test/queries/clientnegative/mapreduce_stack_trace.q
* /hive/trunk/ql/src/test/queries/clientnegative/mapreduce_stack_trace_turnoff.q
* /hive/trunk/ql/src/test/results/clientnegative/mapreduce_stack_trace.q.out
* /hive/trunk/ql/src/test/results/clientnegative/mapreduce_stack_trace_turnoff.q.out

Cache remote map reduce job stack traces for additional logging
---

Key: HIVE-2858
URL: https://issues.apache.org/jira/browse/HIVE-2858
Project: Hive
Issue Type: Improvement
Components: Logging
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Attachments: HIVE-2858.D2223.1.patch, HIVE-2858.D2223.2.patch

Currently we parse the task logs of failed jobs for information to display to the user in the CLI. In addition, we could parse those logs for stack traces and store them in the SessionState. This way, when we log failed queries, these will give us a decent idea of why those queries failed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
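To make the idea in HIVE-2858 concrete, here is a minimal sketch of pulling stack traces out of task log lines and caching them per job. It is an illustration only, not the code from the patch: the class StackTraceCacheSketch, its methods, and the "line starts with 'at '" heuristic are all assumptions made for this example, and a plain in-memory map stands in for the SessionState.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: cache stack traces found in task logs, keyed by job id. */
class StackTraceCacheSketch {
    // jobId -> stack traces for that job; each trace is a list of log lines.
    private final Map<String, List<List<String>>> tracesByJob = new HashMap<>();

    /**
     * Groups consecutive "at ..." frame lines into one trace and keeps the line
     * just before the first frame as the exception header. A "Caused by:" line
     * ends the current trace and becomes the header of the next one.
     */
    static List<List<String>> extractTraces(List<String> logLines) {
        List<List<String>> traces = new ArrayList<>();
        List<String> current = null;
        String previous = null;
        for (String line : logLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("at ")) {
                if (current == null) {
                    current = new ArrayList<>();
                    if (previous != null) {
                        current.add(previous.trim()); // exception header line
                    }
                }
                current.add(trimmed);
            } else if (current != null) {
                traces.add(current); // a non-frame line closes the trace
                current = null;
            }
            previous = line;
        }
        if (current != null) {
            traces.add(current); // log ended in the middle of a trace
        }
        return traces;
    }

    /** Parses one task log and appends its traces to the cache for jobId. */
    void cache(String jobId, List<String> logLines) {
        tracesByJob.computeIfAbsent(jobId, k -> new ArrayList<>())
                   .addAll(extractTraces(logLines));
    }

    /** Returns the cached traces for jobId, or an empty list if none. */
    List<List<String>> tracesFor(String jobId) {
        return tracesByJob.getOrDefault(jobId, Collections.emptyList());
    }
}
```

A hook that logs failed queries could then call tracesFor(jobId) and write the traces next to the query text. The heuristic is deliberately simple and would misfire on ordinary log messages that happen to begin with "at ".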
[jira] [Commented] (HIVE-2929) race condition in DAG execute tasks for hive
[ https://issues.apache.org/jira/browse/HIVE-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249219#comment-13249219 ]

Hudson commented on HIVE-2929:
--

Integrated in Hive-trunk-h0.21 #1360 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1360/])
HIVE-2929 race condition in DAG execute tasks for hive (Namit Jain via Siying Dong) (Revision 1310619)

Result = ABORTED
sdong : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1310619
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java

race condition in DAG execute tasks for hive

Key: HIVE-2929
URL: https://issues.apache.org/jira/browse/HIVE-2929
Project: Hive
Issue Type: Bug
Reporter: Namit Jain
Assignee: Namit Jain
Attachments: hive.2929.1.patch

select ... ( SubQuery involving MapReduce union all SubQuery involving MapReduce );
or
select ... (SubQuery involving MapReduce) join (SubQuery involving MapReduce) ;

If both the subQueries finish at nearly the same time, there is a race condition in which the results of the subQuery finishing last will be completely missed.
[jira] [Commented] (HIVE-2910) Improve the HWI interface
[ https://issues.apache.org/jira/browse/HIVE-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249297#comment-13249297 ]

Ashutosh Chauhan commented on HIVE-2910:

Ran the tests with the patch. All passed. I don't know much about this part of the code, but since Ed (who is the expert in this area) has already +1ed, I will commit it.

Improve the HWI interface
-

Key: HIVE-2910
URL: https://issues.apache.org/jira/browse/HIVE-2910
Project: Hive
Issue Type: Improvement
Components: Web UI
Reporter: Hugo Trippaers
Assignee: Hugo Trippaers
Priority: Minor
Labels: newbie, patch
Attachments: hive-2910.3.patch.log, hive-2910.3.patch.txt, hive-hwi-2.patch, hive-hwi.patch, screenie001.PNG, screenie002.PNG

I've made some improvements to the HWI interface with the Twitter bootstrap system. I'm looking for feedback on the new design.
[jira] [Commented] (HIVE-2910) Improve the HWI interface
[ https://issues.apache.org/jira/browse/HIVE-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249298#comment-13249298 ]

Ashutosh Chauhan commented on HIVE-2910:

It seems there are a few binary resources (icon PNG images) that need to be checked in but haven't been uploaded yet. Hugo, can you upload them here with the license granted to the ASF?

Improve the HWI interface
-

Key: HIVE-2910
URL: https://issues.apache.org/jira/browse/HIVE-2910
Project: Hive
Issue Type: Improvement
Components: Web UI
Reporter: Hugo Trippaers
Assignee: Hugo Trippaers
Priority: Minor
Labels: newbie, patch
Attachments: hive-2910.3.patch.log, hive-2910.3.patch.txt, hive-hwi-2.patch, hive-hwi.patch, screenie001.PNG, screenie002.PNG

I've made some improvements to the HWI interface with the Twitter bootstrap system. I'm looking for feedback on the new design.
[jira] [Commented] (HIVE-2767) Optionally use framed transport with metastore
[ https://issues.apache.org/jira/browse/HIVE-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249300#comment-13249300 ]

Phabricator commented on HIVE-2767:
---

ashutoshc has requested changes to the revision "HIVE-2767 [jira] Optionally use framed transport with metastore".

What will happen if the config variable is set to true in the client but false in the server, or vice versa?

INLINE COMMENTS
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:269 This also needs to be added to conf/hive-default.xml.template, which serves as documentation.
metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:3020 In your earlier version of the patch you were throwing an exception when both secure mode and framed transport were set to true. If you have tested with both set to true and it works, then it's fine; otherwise we should throw an exception here.

REVISION DETAIL
https://reviews.facebook.net/D2661

BRANCH
HIVE-2767_optional_framed_transport

Optionally use framed transport with metastore
--

Key: HIVE-2767
URL: https://issues.apache.org/jira/browse/HIVE-2767
Project: Hive
Issue Type: New Feature
Components: Metastore
Reporter: Travis Crawford
Assignee: Travis Crawford
Attachments: HIVE-2767.D2661.1.patch, HIVE-2767.patch.txt, HIVE-2767_a.patch.txt

Users may want/need to use thrift's framed transport when communicating with the Hive MetaStore. This patch adds a new property {{hive.metastore.thrift.framed.transport.enabled}} that enables the framed transport (defaults to off, aka no change from before the patch). This property must be set for both clients and the HMS server.

It wasn't immediately clear how to use the framed transport with SASL, so as written an exception is thrown if you try starting the server with both options. If SASL and the framed transport will indeed work together I can update the patch (although I don't have a secured environment to test in).
[jira] [Commented] (HIVE-1721) use bloom filters to improve the performance of joins
[ https://issues.apache.org/jira/browse/HIVE-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249310#comment-13249310 ]

Ashutosh Chauhan commented on HIVE-1721:

@Alex, reading the previous comments on this jira, this is proposed to work as follows:
* Create a local task and launch it on the client machine, building a bloom filter on the medium-sized table (~200MB).
* Create a Common Join MR job and launch it on the cluster. Also, ship the bloom filter built in the previous step to all the mapper nodes (via the Distributed Cache).
* In the mapper, look up the key of every row of the large table in the bloom filter. If it exists, send that row to the reducer; otherwise filter it out.
* In the reducer, do the cross-product of the rows of the different tables for a given key to get your joined output.

As outlined above, it will be a win since you will be shuffling much less data from mappers to reducers. The assumptions, though, are that the cost of building the bloom filter on the client machine is small, that there is a huge difference in the sizes of the two tables, and that the join key is highly selective. One or more of these assumptions may be wrong, in which case there might be a performance loss. So there is a trade-off in when to use this.

I don't know if there exists a way to compute a bloom filter in a distributed fashion. If there is such a way, then you can do step 1 through an MR job (instead of locally) and on a much larger table, and then launch a second MR job to do step 2 & 3. Again, there will be trade-offs here.

use bloom filters to improve the performance of joins
-

Key: HIVE-1721
URL: https://issues.apache.org/jira/browse/HIVE-1721
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Namit Jain
Labels: gsoc, gsoc2012, optimization

In case of map-joins, it is likely that the big table will not find many matching rows from the small table. Currently, we perform a hash-map lookup for every row in the big table, which can be pretty expensive. It might be useful to try out a bloom-filter containing all the elements in the small table. Each element from the big table is first searched in the bloom filter, and only in case of a positive match, the small table hash table is explored.
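The filtering step described in this thread can be sketched in a few lines. The following is an illustrative, self-contained bloom filter over String join keys, not Hive code: the class JoinKeyBloomFilter, its sizing parameters, and the double-hashing scheme are assumptions chosen for brevity. The filter is built from the smaller table's keys and then used to drop rows of the big table whose key cannot match.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/** Hypothetical sketch of a bloom filter used to pre-filter join keys. */
class JoinKeyBloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    JoinKeyBloomFilter(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Double hashing: the i-th bit position is derived from two base hashes.
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1; // odd step
        return Math.floorMod(h1 + i * h2, numBits);
    }

    /** Records a key from the smaller table. */
    void add(String key) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(key, i));
        }
    }

    /** False means the key is definitely absent; true means it may be present. */
    boolean mightContain(String key) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    /** Map-side step: keep only the big-table keys that may have a match. */
    static List<String> filterProbeKeys(List<String> bigTableKeys, JoinKeyBloomFilter filter) {
        List<String> kept = new ArrayList<>();
        for (String key : bigTableKeys) {
            if (filter.mightContain(key)) {
                kept.add(key);
            }
        }
        return kept;
    }
}
```

Because a bloom filter has false positives but no false negatives, the rows that survive still have to go through the real join (the hash-table lookup or the reducer); the filter only reduces how many rows get that far.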
[jira] [Commented] (HIVE-1721) use bloom filters to improve the performance of joins
[ https://issues.apache.org/jira/browse/HIVE-1721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249311#comment-13249311 ]

Ashutosh Chauhan commented on HIVE-1721:

The last line should read "then launch a second MR job to do step 3 & 4".

use bloom filters to improve the performance of joins
-

Key: HIVE-1721
URL: https://issues.apache.org/jira/browse/HIVE-1721
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Namit Jain
Labels: gsoc, gsoc2012, optimization

In case of map-joins, it is likely that the big table will not find many matching rows from the small table. Currently, we perform a hash-map lookup for every row in the big table, which can be pretty expensive. It might be useful to try out a bloom-filter containing all the elements in the small table. Each element from the big table is first searched in the bloom filter, and only in case of a positive match, the small table hash table is explored.
[jira] [Updated] (HIVE-2585) Collapse hive.metastore.uris and hive.metastore.local
[ https://issues.apache.org/jira/browse/HIVE-2585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-2585: --- Status: Patch Available (was: Open) Collapse hive.metastore.uris and hive.metastore.local - Key: HIVE-2585 URL: https://issues.apache.org/jira/browse/HIVE-2585 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-2585.D2559.1.patch, HIVE-2585.D2559.2.patch We should just have hive.metastore.uris. If it is empty, we shall assume local mode, if non-empty we shall use that string to connect to remote metastore. Having two different keys for same information is confusing. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
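The rule proposed in HIVE-2585 (an empty hive.metastore.uris means local mode, a non-empty value means remote) fits in one small check. This is a sketch under assumptions, not the patch itself: java.util.Properties stands in for the Hive configuration object, and the class and method names are made up for the example.

```java
import java.util.Properties;

/** Hypothetical sketch: derive local vs. remote metastore mode from one key. */
class MetastoreModeSketch {
    static final String URIS_KEY = "hive.metastore.uris";

    /** Local mode when the URI list is missing or blank. */
    static boolean isLocalMetastore(Properties conf) {
        String uris = conf.getProperty(URIS_KEY);
        return uris == null || uris.trim().isEmpty();
    }

    /** Human-readable summary of the mode the configuration selects. */
    static String describe(Properties conf) {
        if (isLocalMetastore(conf)) {
            return "local metastore";
        }
        return "remote metastore at " + conf.getProperty(URIS_KEY).trim();
    }
}
```

With a single key there is no way to end up with the contradictory combination of a "local" flag plus a non-empty URI list, which is the confusion the issue describes.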
[jira] [Assigned] (HIVE-2883) Metastore client doesnt close connection properly
[ https://issues.apache.org/jira/browse/HIVE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan reassigned HIVE-2883:
--

Assignee: Ashutosh Chauhan

Metastore client doesnt close connection properly
-

Key: HIVE-2883
URL: https://issues.apache.org/jira/browse/HIVE-2883
Project: Hive
Issue Type: Bug
Components: Metastore
Affects Versions: 0.9.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
Fix For: 0.9.0
Attachments: HIVE-2883.D2613.1.patch

While closing the connection, it always fails with the following trace. Seemingly, it doesn't have any harmful effects.

{code}
12/03/20 10:55:02 ERROR hive.metastore: Unable to shutdown local metastore client
org.apache.thrift.transport.TTransportException: Cannot write to null outputStream
    at org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
    at org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:163)
    at org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:91)
    at org.apache.thrift.TServiceClient.sendBase(TServiceClient.java:62)
    at com.facebook.fb303.FacebookService$Client.send_shutdown(FacebookService.java:421)
    at com.facebook.fb303.FacebookService$Client.shutdown(FacebookService.java:415)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:310)
{code}
[jira] [Commented] (HIVE-2767) Optionally use framed transport with metastore
[ https://issues.apache.org/jira/browse/HIVE-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249411#comment-13249411 ]

Phabricator commented on HIVE-2767:
---

travis has commented on the revision "HIVE-2767 [jira] Optionally use framed transport with metastore".

If just one side of the connection is using the framed transport, the connection will fail, likely after the timeout. As this is something that is set up just once per site, it's not likely to change often, and it defaults to off, so the risk of casual misconfiguration is low. Any suggestions on how to better clue the user in that the transport may be the issue?

INLINE COMMENTS
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:269 Good suggestion, done.
metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:3020 Good suggestion. I don't have an environment to test SASL in, so I'll add the check back.

REVISION DETAIL
https://reviews.facebook.net/D2661

BRANCH
HIVE-2767_optional_framed_transport

Optionally use framed transport with metastore
--

Key: HIVE-2767
URL: https://issues.apache.org/jira/browse/HIVE-2767
Project: Hive
Issue Type: New Feature
Components: Metastore
Reporter: Travis Crawford
Assignee: Travis Crawford
Attachments: HIVE-2767.D2661.1.patch, HIVE-2767.patch.txt, HIVE-2767_a.patch.txt

Users may want/need to use thrift's framed transport when communicating with the Hive MetaStore. This patch adds a new property {{hive.metastore.thrift.framed.transport.enabled}} that enables the framed transport (defaults to off, aka no change from before the patch). This property must be set for both clients and the HMS server.

It wasn't immediately clear how to use the framed transport with SASL, so as written an exception is thrown if you try starting the server with both options. If SASL and the framed transport will indeed work together I can update the patch (although I don't have a secured environment to test in).
[jira] [Updated] (HIVE-2767) Optionally use framed transport with metastore
[ https://issues.apache.org/jira/browse/HIVE-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-2767: -- Attachment: HIVE-2767.D2661.2.patch travis updated the revision HIVE-2767 [jira] Optionally use framed transport with metastore. Reviewers: JIRA, ashutoshc - REVISION DETAIL https://reviews.facebook.net/D2661 AFFECTED FILES common/src/java/org/apache/hadoop/hive/conf/HiveConf.java conf/hive-default.xml.template metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java Optionally use framed transport with metastore -- Key: HIVE-2767 URL: https://issues.apache.org/jira/browse/HIVE-2767 Project: Hive Issue Type: New Feature Components: Metastore Reporter: Travis Crawford Assignee: Travis Crawford Attachments: HIVE-2767.D2661.1.patch, HIVE-2767.D2661.2.patch, HIVE-2767.patch.txt, HIVE-2767_a.patch.txt Users may want/need to use thrift's framed transport when communicating with the Hive MetaStore. This patch adds a new property {{hive.metastore.thrift.framed.transport.enabled}} that enables the framed transport (defaults to off, aka no change from before the patch). This property must be set for both clients and the HMS server. It wasn't immediately clear how to use the framed transport with SASL, so as written an exception is thrown if you try starting the server with both options. If SASL and the framed transport will indeed work together I can update the patch (although I don't have a secured environment to test in). -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2767) Optionally use framed transport with metastore
[ https://issues.apache.org/jira/browse/HIVE-2767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-2767: -- Attachment: HIVE-2767.D2661.3.patch travis updated the revision HIVE-2767 [jira] Optionally use framed transport with metastore. Reviewers: JIRA, ashutoshc Update per comments from review. Add option to hive-default.xml, and disallow framed transport when using SASL as that has not been tested. REVISION DETAIL https://reviews.facebook.net/D2661 AFFECTED FILES common/src/java/org/apache/hadoop/hive/conf/HiveConf.java conf/hive-default.xml.template metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java Optionally use framed transport with metastore -- Key: HIVE-2767 URL: https://issues.apache.org/jira/browse/HIVE-2767 Project: Hive Issue Type: New Feature Components: Metastore Reporter: Travis Crawford Assignee: Travis Crawford Attachments: HIVE-2767.D2661.1.patch, HIVE-2767.D2661.2.patch, HIVE-2767.D2661.3.patch, HIVE-2767.patch.txt, HIVE-2767_a.patch.txt Users may want/need to use thrift's framed transport when communicating with the Hive MetaStore. This patch adds a new property {{hive.metastore.thrift.framed.transport.enabled}} that enables the framed transport (defaults to off, aka no change from before the patch). This property must be set for both clients and the HMS server. It wasn't immediately clear how to use the framed transport with SASL, so as written an exception is thrown if you try starting the server with both options. If SASL and the framed transport will indeed work together I can update the patch (although I don't have a secured environment to test in). -- This message is automatically generated by JIRA. 
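The start-up check discussed in the HIVE-2767 review (refuse to start when both SASL and the framed transport are enabled) might look roughly like the sketch below. It is not taken from the patch: java.util.Properties stands in for the Hive configuration, the class and method names are hypothetical, and the SASL key name hive.metastore.sasl.enabled is an assumption that should be checked against the Hive version in use. Only hive.metastore.thrift.framed.transport.enabled comes from the issue description.

```java
import java.util.Properties;

/** Hypothetical sketch: reject the untested SASL + framed transport combination. */
class FramedTransportConfigCheck {
    // Property named in the HIVE-2767 description.
    static final String FRAMED_KEY = "hive.metastore.thrift.framed.transport.enabled";
    // Assumed name of the SASL switch; verify against your Hive version.
    static final String SASL_KEY = "hive.metastore.sasl.enabled";

    /** Reads a boolean property, treating a missing key as false. */
    static boolean flag(Properties conf, String key) {
        return Boolean.parseBoolean(conf.getProperty(key, "false"));
    }

    /** Throws if both options are on, since that combination has not been tested. */
    static void validate(Properties conf) {
        if (flag(conf, FRAMED_KEY) && flag(conf, SASL_KEY)) {
            throw new IllegalArgumentException(
                "Framed transport is not supported together with SASL: unset "
                + FRAMED_KEY + " or " + SASL_KEY);
        }
    }
}
```

A check like this only covers the server's own settings. It does not detect the client/server mismatch raised in the review, where one side uses the framed transport and the other does not, so that case would still surface as a failed or timed-out connection.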
HiveServer2 Thrift API Proposal
Hi, I wrote up a proposal for a new HiveServer2 Thrift API. It's available on the wiki here: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API I'd appreciate it greatly if all interested parties would review the proposal and respond with feedback. We can discuss any remaining issues at the next Hive Contributors Meeting on April 18th. Thanks. Carl
[jira] [Commented] (HIVE-80) Add testcases for concurrent query execution
[ https://issues.apache.org/jira/browse/HIVE-80?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13249465#comment-13249465 ]

Carl Steinbach commented on HIVE-80:

HiveServer can't support concurrent connections due to a limitation of the current HiveServer Thrift API. A proposal for a new HiveServer2 Thrift API that fixes these problems is located here: https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Thrift+API

Add testcases for concurrent query execution

Key: HIVE-80
URL: https://issues.apache.org/jira/browse/HIVE-80
Project: Hive
Issue Type: Test
Components: Query Processor, Server Infrastructure
Reporter: Raghotham Murthy
Assignee: Carl Steinbach
Priority: Critical
Labels: concurrency
Attachments: hive_input_format_race-2.patch

Can use one driver object per query.