[jira] Updated: (HIVE-1423) Remove Thrift/FB303 headers/src from Hive source tree
[ https://issues.apache.org/jira/browse/HIVE-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1423: - Component/s: Server Infrastructure > Remove Thrift/FB303 headers/src from Hive source tree > - > > Key: HIVE-1423 > URL: https://issues.apache.org/jira/browse/HIVE-1423 > Project: Hadoop Hive > Issue Type: Bug > Components: Clients, Server Infrastructure >Reporter: Carl Steinbach > > There is a fair amount of code from the Thrift and fb303 libraries that was > checked into the Hive source tree as part of HIVE-73. This code should be > removed and the odbc driver Makefile should be reworked to depend on the > contents of THRIFT_HOME and FB303_HOME as defined by the user. > {code} > ./service/include/thrift/concurrency/Exception.h > ./service/include/thrift/concurrency/FunctionRunner.h > ./service/include/thrift/concurrency/Monitor.h > ./service/include/thrift/concurrency/Mutex.h > ./service/include/thrift/concurrency/PosixThreadFactory.h > ./service/include/thrift/concurrency/Thread.h > ./service/include/thrift/concurrency/ThreadManager.h > ./service/include/thrift/concurrency/TimerManager.h > ./service/include/thrift/concurrency/Util.h > ./service/include/thrift/config.h > ./service/include/thrift/fb303/FacebookBase.h > ./service/include/thrift/fb303/FacebookService.cpp > ./service/include/thrift/fb303/FacebookService.h > ./service/include/thrift/fb303/fb303_constants.cpp > ./service/include/thrift/fb303/fb303_constants.h > ./service/include/thrift/fb303/fb303_types.cpp > ./service/include/thrift/fb303/fb303_types.h > ./service/include/thrift/fb303/if/fb303.thrift > ./service/include/thrift/fb303/out > ./service/include/thrift/fb303/ServiceTracker.h > ./service/include/thrift/if/reflection_limited.thrift > ./service/include/thrift/processor/PeekProcessor.h > ./service/include/thrift/processor/StatsProcessor.h > ./service/include/thrift/protocol/TBase64Utils.h > ./service/include/thrift/protocol/TBinaryProtocol.h > ./service/include/thrift/protocol/TCompactProtocol.h > ./service/include/thrift/protocol/TDebugProtocol.h > ./service/include/thrift/protocol/TDenseProtocol.h > ./service/include/thrift/protocol/TJSONProtocol.h > ./service/include/thrift/protocol/TOneWayProtocol.h > ./service/include/thrift/protocol/TProtocol.h > ./service/include/thrift/protocol/TProtocolException.h > ./service/include/thrift/protocol/TProtocolTap.h > ./service/include/thrift/reflection_limited_types.h > ./service/include/thrift/server/TNonblockingServer.h > ./service/include/thrift/server/TServer.h > ./service/include/thrift/server/TSimpleServer.h > ./service/include/thrift/server/TThreadedServer.h > ./service/include/thrift/server/TThreadPoolServer.h > ./service/include/thrift/Thrift.h > ./service/include/thrift/TLogging.h > ./service/include/thrift/TProcessor.h > ./service/include/thrift/transport/TBufferTransports.h > ./service/include/thrift/transport/TFDTransport.h > ./service/include/thrift/transport/TFileTransport.h > ./service/include/thrift/transport/THttpClient.h > ./service/include/thrift/transport/TServerSocket.h > ./service/include/thrift/transport/TServerTransport.h > ./service/include/thrift/transport/TShortReadTransport.h > ./service/include/thrift/transport/TSimpleFileTransport.h > ./service/include/thrift/transport/TSocket.h > ./service/include/thrift/transport/TSocketPool.h > ./service/include/thrift/transport/TTransport.h > ./service/include/thrift/transport/TTransportException.h > ./service/include/thrift/transport/TTransportUtils.h > 
./service/include/thrift/transport/TZlibTransport.h > ./service/include/thrift/TReflectionLocal.h > ./service/lib/php/autoload.php > ./service/lib/php/ext/thrift_protocol > ./service/lib/php/ext/thrift_protocol/config.m4 > ./service/lib/php/ext/thrift_protocol/php_thrift_protocol.cpp > ./service/lib/php/ext/thrift_protocol/php_thrift_protocol.h > ./service/lib/php/ext/thrift_protocol/tags/1.0.0/config.m4 > ./service/lib/php/ext/thrift_protocol/tags/1.0.0/php_thrift_protocol.cpp > ./service/lib/php/ext/thrift_protocol/tags/1.0.0/php_thrift_protocol.h > ./service/lib/php/packages/fb303/FacebookService.php > ./service/lib/php/packages/fb303/fb303_types.php > ./service/lib/php/protocol/TBinaryProtocol.php > ./service/lib/php/protocol/TProtocol.php > ./service/lib/php/Thrift.php > ./service/lib/php/transport/TBufferedTransport.php > ./service/lib/php/transport/TFramedTransport.php > ./service/lib/php/transport/THttpClient.php > ./service/lib/php/transport/TMemoryBuffer.php > ./service/lib/php/transport/TNullTransport.php > ./service/lib/php/transport/TPhpStream.php > ./service/lib/php/transport/TSocket.php > ./service/lib/php/transport/TSocketPool.php > ./service/lib/php/transport/TTransport.php > ./service/lib/py/fb303/__init__.py > ./service/lib/py/fb303/cons
[jira] Updated: (HIVE-73) Thrift Server and Client for Hive
[ https://issues.apache.org/jira/browse/HIVE-73?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-73: --- Fix Version/s: 0.3.0 (was: 0.6.0) > Thrift Server and Client for Hive > - > > Key: HIVE-73 > URL: https://issues.apache.org/jira/browse/HIVE-73 > Project: Hadoop Hive > Issue Type: New Feature > Components: Clients, Server Infrastructure >Reporter: Raghotham Murthy >Assignee: Raghotham Murthy > Fix For: 0.3.0 > > Attachments: hive-73.1.patch, hive-73.10.patch, hive-73.11.patch, > hive-73.12.patch, hive-73.2.patch, hive-73.3.txt, hive-73.4.txt, > hive-73.5.txt, hive-73.6.patch, hive-73.7.patch, hive-73.8.patch, > hive-73.9.patch > > > Currently the Hive CLI directly calls the driver code. We need to be able to > run a standalone Hive server that multiple clients can connect to. The Hive > server will allow clients to run queries as well as make metadata calls (by > inheriting from the Thrift metastore server) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-73) Thrift Server and Client for Hive
[ https://issues.apache.org/jira/browse/HIVE-73?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-73: --- Component/s: Clients > Thrift Server and Client for Hive > - > > Key: HIVE-73 > URL: https://issues.apache.org/jira/browse/HIVE-73 > Project: Hadoop Hive > Issue Type: New Feature > Components: Clients, Server Infrastructure >Reporter: Raghotham Murthy >Assignee: Raghotham Murthy > Fix For: 0.3.0 > > Attachments: hive-73.1.patch, hive-73.10.patch, hive-73.11.patch, > hive-73.12.patch, hive-73.2.patch, hive-73.3.txt, hive-73.4.txt, > hive-73.5.txt, hive-73.6.patch, hive-73.7.patch, hive-73.8.patch, > hive-73.9.patch > > > Currently the Hive CLI directly calls the driver code. We need to be able to > run a standalone Hive server that multiple clients can connect to. The Hive > server will allow clients to run queries as well as make metadata calls (by > inheriting from the Thrift metastore server) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1423) Remove Thrift/FB303 headers/src from Hive source tree
Remove Thrift/FB303 headers/src from Hive source tree - Key: HIVE-1423 URL: https://issues.apache.org/jira/browse/HIVE-1423 Project: Hadoop Hive Issue Type: Bug Components: Clients Reporter: Carl Steinbach There is a fair amount of code from the Thrift and fb303 libraries that was checked into the Hive source tree as part of HIVE-73. This code should be removed and the odbc driver Makefile should be reworked to depend on the contents of THRIFT_HOME and FB303_HOME as defined by the user. {code} ./service/include/thrift/concurrency/Exception.h ./service/include/thrift/concurrency/FunctionRunner.h ./service/include/thrift/concurrency/Monitor.h ./service/include/thrift/concurrency/Mutex.h ./service/include/thrift/concurrency/PosixThreadFactory.h ./service/include/thrift/concurrency/Thread.h ./service/include/thrift/concurrency/ThreadManager.h ./service/include/thrift/concurrency/TimerManager.h ./service/include/thrift/concurrency/Util.h ./service/include/thrift/config.h ./service/include/thrift/fb303/FacebookBase.h ./service/include/thrift/fb303/FacebookService.cpp ./service/include/thrift/fb303/FacebookService.h ./service/include/thrift/fb303/fb303_constants.cpp ./service/include/thrift/fb303/fb303_constants.h ./service/include/thrift/fb303/fb303_types.cpp ./service/include/thrift/fb303/fb303_types.h ./service/include/thrift/fb303/if/fb303.thrift ./service/include/thrift/fb303/out ./service/include/thrift/fb303/ServiceTracker.h ./service/include/thrift/if/reflection_limited.thrift ./service/include/thrift/processor/PeekProcessor.h ./service/include/thrift/processor/StatsProcessor.h ./service/include/thrift/protocol/TBase64Utils.h ./service/include/thrift/protocol/TBinaryProtocol.h ./service/include/thrift/protocol/TCompactProtocol.h ./service/include/thrift/protocol/TDebugProtocol.h ./service/include/thrift/protocol/TDenseProtocol.h ./service/include/thrift/protocol/TJSONProtocol.h ./service/include/thrift/protocol/TOneWayProtocol.h ./service/include/thrift/protocol/TProtocol.h ./service/include/thrift/protocol/TProtocolException.h ./service/include/thrift/protocol/TProtocolTap.h ./service/include/thrift/reflection_limited_types.h ./service/include/thrift/server/TNonblockingServer.h ./service/include/thrift/server/TServer.h ./service/include/thrift/server/TSimpleServer.h ./service/include/thrift/server/TThreadedServer.h ./service/include/thrift/server/TThreadPoolServer.h ./service/include/thrift/Thrift.h ./service/include/thrift/TLogging.h ./service/include/thrift/TProcessor.h ./service/include/thrift/transport/TBufferTransports.h ./service/include/thrift/transport/TFDTransport.h ./service/include/thrift/transport/TFileTransport.h ./service/include/thrift/transport/THttpClient.h ./service/include/thrift/transport/TServerSocket.h ./service/include/thrift/transport/TServerTransport.h ./service/include/thrift/transport/TShortReadTransport.h ./service/include/thrift/transport/TSimpleFileTransport.h ./service/include/thrift/transport/TSocket.h ./service/include/thrift/transport/TSocketPool.h ./service/include/thrift/transport/TTransport.h ./service/include/thrift/transport/TTransportException.h ./service/include/thrift/transport/TTransportUtils.h ./service/include/thrift/transport/TZlibTransport.h ./service/include/thrift/TReflectionLocal.h ./service/lib/php/autoload.php ./service/lib/php/ext/thrift_protocol ./service/lib/php/ext/thrift_protocol/config.m4 ./service/lib/php/ext/thrift_protocol/php_thrift_protocol.cpp ./service/lib/php/ext/thrift_protocol/php_thrift_protocol.h 
./service/lib/php/ext/thrift_protocol/tags/1.0.0/config.m4 ./service/lib/php/ext/thrift_protocol/tags/1.0.0/php_thrift_protocol.cpp ./service/lib/php/ext/thrift_protocol/tags/1.0.0/php_thrift_protocol.h ./service/lib/php/packages/fb303/FacebookService.php ./service/lib/php/packages/fb303/fb303_types.php ./service/lib/php/protocol/TBinaryProtocol.php ./service/lib/php/protocol/TProtocol.php ./service/lib/php/Thrift.php ./service/lib/php/transport/TBufferedTransport.php ./service/lib/php/transport/TFramedTransport.php ./service/lib/php/transport/THttpClient.php ./service/lib/php/transport/TMemoryBuffer.php ./service/lib/php/transport/TNullTransport.php ./service/lib/php/transport/TPhpStream.php ./service/lib/php/transport/TSocket.php ./service/lib/php/transport/TSocketPool.php ./service/lib/php/transport/TTransport.php ./service/lib/py/fb303/__init__.py ./service/lib/py/fb303/constants.py ./service/lib/py/fb303/FacebookBase.py ./service/lib/py/fb303/FacebookService-remote ./service/lib/py/fb303/FacebookService.py ./service/lib/py/fb303/ttypes.py ./service/lib/py/fb303_scripts/__init__.py ./service/lib/py/fb303_scripts/fb303_simple_mgmt.py ./service/lib/py/thrift/__init__.py ./service/lib/py/thrift/protocol ./service/lib/py/thrift/protocol/__init__.py ./service/lib/py/thrift/protocol/fastbinary.c ./service/lib/py/
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881026#action_12881026 ] Paul Yang commented on HIVE-1176: - I was going to look at it again today, but looks like I'll get to it around mid-day tomorrow? Will keep this posted. > 'create if not exists' fails for a table name with 'select' in it > - > > Key: HIVE-1176 > URL: https://issues.apache.org/jira/browse/HIVE-1176 > Project: Hadoop Hive > Issue Type: Bug > Components: Metastore, Query Processor >Reporter: Prasad Chakka >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-1176-1.patch, HIVE-1176-2.patch, > HIVE-1176.lib-files.tar.gz, HIVE-1176.patch > > > hive> create table if not exists tmp_select(s string, c string, n int); > org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got > exception: javax.jdo.JDOUserException JDOQL Single-String query should always > start with SELECT) > at > org.apache.hadoop.hive.ql.metadata.Hive.getTablesForDb(Hive.java:441) > at > org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:423) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:5538) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5192) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275) > at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:312) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.util.RunJar.main(RunJar.java:156) > Caused by: MetaException(message:Got exception: javax.jdo.JDOUserException > JDOQL Single-String query should always start with SELECT) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:612) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTables(HiveMetaStoreClient.java:450) > at > org.apache.hadoop.hive.ql.metadata.Hive.getTablesForDb(Hive.java:439) > ... 15 more -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
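The JDOUserException above is raised inside the metastore's JDOQL table lookup. A common remedy for this class of failure is to bind the user-supplied pattern as a declared JDOQL parameter instead of splicing it into the query string; the sketch below illustrates that idea only (the class, the tableName field, and the filter are assumptions, not the committed HIVE-1176 fix).
{code}
// Illustrative sketch only (hypothetical names; not the committed fix):
// binding the table pattern as a JDOQL parameter keeps names containing
// keywords such as "select" out of the query text itself.
import java.util.Collection;
import javax.jdo.PersistenceManager;
import javax.jdo.Query;

public class TableLookupSketch {
  public static Collection<?> tablesMatching(PersistenceManager pm,
                                             Class<?> tableClass,
                                             String pattern) {
    Query q = pm.newQuery(tableClass);
    q.setFilter("tableName.matches(pat)");       // "tableName" is assumed
    q.declareParameters("java.lang.String pat"); // pattern stays a parameter
    return (Collection<?>) q.execute(pattern);
  }
}
{code}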
[jira] Commented: (HIVE-1271) Case sensitiveness of type information specified when using custom reducer causes type mismatch
[ https://issues.apache.org/jira/browse/HIVE-1271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881024#action_12881024 ] Arvind Prabhakar commented on HIVE-1271: Is anyone reviewing this change? Thanks. > Case sensitiveness of type information specified when using custom reducer > causes type mismatch > --- > > Key: HIVE-1271 > URL: https://issues.apache.org/jira/browse/HIVE-1271 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.5.0 >Reporter: Dilip Joseph >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-1271-1.patch, HIVE-1271.patch > > > Type information specified while using a custom reduce script is converted > to lower case, and causes a type mismatch during query semantic analysis. The > following REDUCE query, where field name = "userId", failed. > hive> CREATE TABLE SS ( >> a INT, >> b INT, >> vals ARRAY<STRUCT<userId:INT>> >> ); > OK > hive> FROM (select * from srcTable DISTRIBUTE BY id SORT BY id) s >> INSERT OVERWRITE TABLE SS >> REDUCE * >> USING 'myreduce.py' >> AS >> (a INT, >> b INT, >> vals ARRAY<STRUCT<userId:INT>> >> ) >> ; > FAILED: Error in semantic analysis: line 2:27 Cannot insert into > target table because column number/types are different SS: Cannot > convert column 2 from array<struct<userid:int>> to > array<struct<userId:int>>. > The same query worked fine after changing "userId" to "userid". -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
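Since identifiers such as struct field names are case-insensitive in Hive DDL, the comparison during semantic analysis needs to ignore case as well. A minimal sketch of that idea (a hypothetical helper, not the committed HIVE-1271 fix):
{code}
// Minimal sketch (hypothetical helper, not the committed fix): field names
// are case-insensitive identifiers, so "userId" from the target table must
// compare equal to the lower-cased "userid" produced from the AS clause.
public class FieldNameCompare {
  public static boolean sameField(String declared, String parsed) {
    return declared != null && declared.equalsIgnoreCase(parsed);
  }

  public static void main(String[] args) {
    System.out.println(sameField("userId", "userid")); // true
  }
}
{code}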
[jira] Commented: (HIVE-1176) 'create if not exists' fails for a table name with 'select' in it
[ https://issues.apache.org/jira/browse/HIVE-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881023#action_12881023 ] Arvind Prabhakar commented on HIVE-1176: @Paul: Any updates on this from your end? Thanks. > 'create if not exists' fails for a table name with 'select' in it > - > > Key: HIVE-1176 > URL: https://issues.apache.org/jira/browse/HIVE-1176 > Project: Hadoop Hive > Issue Type: Bug > Components: Metastore, Query Processor >Reporter: Prasad Chakka >Assignee: Arvind Prabhakar > Fix For: 0.6.0 > > Attachments: HIVE-1176-1.patch, HIVE-1176-2.patch, > HIVE-1176.lib-files.tar.gz, HIVE-1176.patch > > > hive> create table if not exists tmp_select(s string, c string, n int); > org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Got > exception: javax.jdo.JDOUserException JDOQL Single-String query should always > start with SELECT) > at > org.apache.hadoop.hive.ql.metadata.Hive.getTablesForDb(Hive.java:441) > at > org.apache.hadoop.hive.ql.metadata.Hive.getTablesByPattern(Hive.java:423) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:5538) > at > org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5192) > at > org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105) > at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275) > at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320) > at org.apache.hadoop.hive.ql.Driver.run(Driver.java:312) > at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123) > at > org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181) > at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) > at java.lang.reflect.Method.invoke(Method.java:597) > at org.apache.hadoop.util.RunJar.main(RunJar.java:156) > Caused by: MetaException(message:Got exception: javax.jdo.JDOUserException > JDOQL Single-String query should always start with SELECT) > at > org.apache.hadoop.hive.metastore.MetaStoreUtils.logAndThrowMetaException(MetaStoreUtils.java:612) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTables(HiveMetaStoreClient.java:450) > at > org.apache.hadoop.hive.ql.metadata.Hive.getTablesForDb(Hive.java:439) > ... 15 more -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881020#action_12881020 ] Arvind Prabhakar commented on HIVE-287: --- @John: Can you please take a look at the updated patch? Let me know if you have any feedback, and I'll tweak this change further as necessary. > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-287) count distinct on multiple columns does not work
[ https://issues.apache.org/jira/browse/HIVE-287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvind Prabhakar updated HIVE-287: -- Attachment: HIVE-287-4.patch applies cleanly on trunk and branch-0.6 > count distinct on multiple columns does not work > > > Key: HIVE-287 > URL: https://issues.apache.org/jira/browse/HIVE-287 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Reporter: Namit Jain >Assignee: Arvind Prabhakar > Attachments: HIVE-287-1.patch, HIVE-287-2.patch, HIVE-287-3.patch, > HIVE-287-4.patch > > > The following query does not work: > select count(distinct col1, col2) from Tbl -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1422) skip counter update when RunningJob.getCounters() returns null
[ https://issues.apache.org/jira/browse/HIVE-1422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Sichi updated HIVE-1422: - Attachment: HIVE-1422.1.patch > skip counter update when RunningJob.getCounters() returns null > -- > > Key: HIVE-1422 > URL: https://issues.apache.org/jira/browse/HIVE-1422 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.6.0 >Reporter: John Sichi >Assignee: John Sichi > Fix For: 0.7.0 > > Attachments: HIVE-1422.1.patch > > > Under heavy load on some Hadoop versions, we may get an NPE from > trying to dereference a null Counters object. I don't have a unit test which > can reproduce it, but here's an example stack from a production cluster we > saw today: > 10/06/21 13:01:10 ERROR exec.ExecDriver: Ended Job = job_201005200457_701060 > with exception 'java.lang.NullPointerException(null)' > java.lang.NullPointerException > at org.apache.hadoop.hive.ql.exec.Operator.updateCounters(Operator.java:999) > at > org.apache.hadoop.hive.ql.exec.ExecDriver.updateCounters(ExecDriver.java:503) > at org.apache.hadoop.hive.ql.exec.ExecDriver.progress(ExecDriver.java:390) > at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:697) > at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:107) > at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55) > at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1422) skip counter update when RunningJob.getCounters() returns null
skip counter update when RunningJob.getCounters() returns null -- Key: HIVE-1422 URL: https://issues.apache.org/jira/browse/HIVE-1422 Project: Hadoop Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.6.0 Reporter: John Sichi Assignee: John Sichi Fix For: 0.7.0 Under heavy load on some Hadoop versions, we may get an NPE from trying to dereference a null Counters object. I don't have a unit test which can reproduce it, but here's an example stack from a production cluster we saw today: 10/06/21 13:01:10 ERROR exec.ExecDriver: Ended Job = job_201005200457_701060 with exception 'java.lang.NullPointerException(null)' java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.Operator.updateCounters(Operator.java:999) at org.apache.hadoop.hive.ql.exec.ExecDriver.updateCounters(ExecDriver.java:503) at org.apache.hadoop.hive.ql.exec.ExecDriver.progress(ExecDriver.java:390) at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:697) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:107) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55) at org.apache.hadoop.hive.ql.exec.TaskRunner.run(TaskRunner.java:47) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
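The fix described in the title is essentially a null guard around the Counters dereference. A minimal sketch of the idea (hypothetical class and method names, not the attached HIVE-1422.1.patch):
{code}
// Minimal sketch (hypothetical names): skip the counter update for this
// progress tick when RunningJob.getCounters() returns null, instead of
// letting Operator.updateCounters() dereference it and throw an NPE.
import java.io.IOException;
import org.apache.hadoop.mapred.Counters;
import org.apache.hadoop.mapred.RunningJob;

public class CounterUpdateGuard {
  /** Returns false when counters were unavailable and the update was skipped. */
  public static boolean updateCountersSafely(RunningJob rj) throws IOException {
    Counters ctrs = rj.getCounters();
    if (ctrs == null) {
      // Some Hadoop versions return null under heavy load; just skip.
      return false;
    }
    // ... proceed with the normal per-operator counter update here ...
    return true;
  }
}
{code}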
[jira] Commented: (HIVE-1417) Archived partitions throw error with queries calling getContentSummary
[ https://issues.apache.org/jira/browse/HIVE-1417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881004#action_12881004 ] Namit Jain commented on HIVE-1417: -- will review > Archived partitions throw error with queries calling getContentSummary > -- > > Key: HIVE-1417 > URL: https://issues.apache.org/jira/browse/HIVE-1417 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.6.0, 0.7.0 >Reporter: Paul Yang >Assignee: Paul Yang > Attachments: HIVE-1417.1.patch, HIVE-1417.branch-0.6.1.patch > > > Assuming you have a src table with a ds='1' partition that is archived in > HDFS, the following query will throw an exception > {code} > select count(1) from src where ds='1' group by key; > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
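One way to make such queries robust is to stop assuming getContentSummary() works on every filesystem backing a partition (archived partitions live behind har:// paths). The sketch below shows a fallback of that kind; it is an assumption for illustration, not the attached HIVE-1417 patch:
{code}
// Hedged sketch (an assumption, not the HIVE-1417 patch): fall back to
// summing FileStatus lengths when getContentSummary() fails for a path,
// e.g. a har:// path backing an archived partition.
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SafeContentSummary {
  public static long totalLength(FileSystem fs, Path p) throws IOException {
    try {
      return fs.getContentSummary(p).getLength();
    } catch (IOException e) {
      long total = 0;
      for (FileStatus stat : fs.listStatus(p)) {
        total += stat.isDir() ? totalLength(fs, stat.getPath()) : stat.getLen();
      }
      return total;
    }
  }
}
{code}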
[jira] Resolved: (HIVE-1421) problem with sequence and rcfiles are mixed for null partitions
[ https://issues.apache.org/jira/browse/HIVE-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Yongqiang resolved HIVE-1421. Hadoop Flags: [Reviewed] Resolution: Fixed Namit sent me the patch, and we tested and committed it during the JIRA downtime. Closing this. > problem with sequence and rcfiles are mixed for null partitions > --- > > Key: HIVE-1421 > URL: https://issues.apache.org/jira/browse/HIVE-1421 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.6.0, 0.7.0 >Reporter: He Yongqiang >Assignee: Namit Jain > Fix For: 0.6.0, 0.7.0 > > Attachments: hive.1421.2.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1421) problem with sequence and rcfiles are mixed for null partitions
[ https://issues.apache.org/jira/browse/HIVE-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12881001#action_12881001 ] Namit Jain commented on HIVE-1421: -- patch for 0.6 and trunk > problem with sequence and rcfiles are mixed for null partitions > --- > > Key: HIVE-1421 > URL: https://issues.apache.org/jira/browse/HIVE-1421 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.6.0, 0.7.0 >Reporter: He Yongqiang >Assignee: Namit Jain > Fix For: 0.6.0, 0.7.0 > > Attachments: hive.1421.2.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1421) problem with sequence and rcfiles are mixed for null partitions
[ https://issues.apache.org/jira/browse/HIVE-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-1421: - Attachment: hive.1421.2.patch > problem with sequence and rcfiles are mixed for null partitions > --- > > Key: HIVE-1421 > URL: https://issues.apache.org/jira/browse/HIVE-1421 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.6.0, 0.7.0 >Reporter: He Yongqiang >Assignee: Namit Jain > Fix For: 0.6.0, 0.7.0 > > Attachments: hive.1421.2.patch > > -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1412) CombineHiveInputFormat bug on tablesample
[ https://issues.apache.org/jira/browse/HIVE-1412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-1412: - Status: Resolved (was: Patch Available) Hadoop Flags: [Reviewed] Resolution: Fixed Committed. Thanks Ning > CombineHiveInputFormat bug on tablesample > - > > Key: HIVE-1412 > URL: https://issues.apache.org/jira/browse/HIVE-1412 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.6.0, 0.7.0 >Reporter: Ning Zhang >Assignee: Ning Zhang > Fix For: 0.6.0, 0.7.0 > > Attachments: HIVE-1412.2.patch, HIVE-1412.patch > > > CombineHiveInputFormat should combine all files inside one partition to form > a split, but should not take files across partition boundaries. This works for > regular tables and partitions, since all input paths are directories. However, > this breaks when the inputs are files (in which case tablesample could be the > use case). CombineHiveInputFormat should adjust to the case where the inputs > may also be non-directories. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
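In other words, the grouping key used when combining files must be the partition directory even when an input path is itself a file. A minimal sketch of that adjustment (assumed names, not the committed HIVE-1412 change):
{code}
// Minimal sketch (assumed names, not the committed change): group input
// paths by their partition directory so a combined split can never cross a
// partition boundary, even when tablesample hands us individual files.
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CombinePoolKey {
  /** Directory used to pool files for combining: the path itself if it is a
   *  directory, otherwise its parent (the partition containing the file). */
  public static Path poolKey(FileSystem fs, Path input) throws IOException {
    FileStatus stat = fs.getFileStatus(input);
    return stat.isDir() ? input : input.getParent();
  }
}
{code}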
[jira] Resolved: (HIVE-1420) problem with sequence and rcfiles are mixed for null partitions
[ https://issues.apache.org/jira/browse/HIVE-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Yongqiang resolved HIVE-1420. Resolution: Duplicate duplicate of 1421 > problem with sequence and rcfiles are mixed for null partitions > --- > > Key: HIVE-1420 > URL: https://issues.apache.org/jira/browse/HIVE-1420 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.6.0 >Reporter: Namit Jain >Assignee: He Yongqiang > Fix For: 0.6.0, 0.7.0 > > Attachments: hive-1420.1.patch > > > drop table foo; > create table foo (src int, value string) partitioned by (ds string); > alter table foo set fileformat Sequencefile; > insert overwrite table foo partition (ds='1') > select key, value from src; > alter table foo add partition (ds='2'); > alter table foo set fileformat rcfile; > select count(1) from foo; > The above testcase fails -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1420) problem with sequence and rcfiles are mixed for null partitions
[ https://issues.apache.org/jira/browse/HIVE-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] He Yongqiang updated HIVE-1420: --- Status: Open (was: Patch Available) > problem with sequence and rcfiles are mixed for null partitions > --- > > Key: HIVE-1420 > URL: https://issues.apache.org/jira/browse/HIVE-1420 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.6.0 >Reporter: Namit Jain >Assignee: He Yongqiang > Fix For: 0.6.0, 0.7.0 > > Attachments: hive-1420.1.patch > > > drop table foo; > create table foo (src int, value string) partitioned by (ds string); > alter table foo set fileformat Sequencefile; > insert overwrite table foo partition (ds='1') > select key, value from src; > alter table foo add partition (ds='2'); > alter table foo set fileformat rcfile; > select count(1) from foo; > The above testcase fails -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-1421) problem with sequence and rcfiles are mixed for null partitions
problem with sequence and rcfiles are mixed for null partitions --- Key: HIVE-1421 URL: https://issues.apache.org/jira/browse/HIVE-1421 Project: Hadoop Hive Issue Type: Bug Affects Versions: 0.6.0, 0.7.0 Reporter: He Yongqiang Assignee: Namit Jain Fix For: 0.6.0, 0.7.0 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1414) automatically invoke .hiverc init script
[ https://issues.apache.org/jira/browse/HIVE-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Sichi updated HIVE-1414: - Fix Version/s: 0.7.0 > automatically invoke .hiverc init script > > > Key: HIVE-1414 > URL: https://issues.apache.org/jira/browse/HIVE-1414 > Project: Hadoop Hive > Issue Type: Improvement > Components: Clients >Affects Versions: 0.5.0 >Reporter: John Sichi >Assignee: Edward Capriolo > Fix For: 0.7.0 > > Attachments: hive-1414-patch-1.txt > > > Similar to .bashrc but run Hive SQL commands. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1414) automatically invoke .hiverc init script
[ https://issues.apache.org/jira/browse/HIVE-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880992#action_12880992 ] John Sichi commented on HIVE-1414: -- OK, those choices make sense. Some review comments: 1) In HIVE-1405, I added a processFile method which takes care of closing the reader to avoid a resource leak. Could you review and commit that patch, and then update your patch here to call processFile? 2) If either getenv or getProperty returns null, we should skip the corresponding exists check completely to avoid looking for a filename like ("null/bin/.hiverc"). 3) I think your code needs to move up into my processInitFiles location, otherwise it won't get run for the -f and -e cases. Also, let's say that if -i is specified, then we skip the .hiverc execution (to match bash --init-file behavior). Note that .hiverc execution should happen inside of my silent-mode block so that it does not show up in console output. > automatically invoke .hiverc init script > > > Key: HIVE-1414 > URL: https://issues.apache.org/jira/browse/HIVE-1414 > Project: Hadoop Hive > Issue Type: Improvement > Components: Clients >Affects Versions: 0.5.0 >Reporter: John Sichi >Assignee: Edward Capriolo > Attachments: hive-1414-patch-1.txt > > > Similar to .bashrc but run Hive SQL commands. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
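Review comment 2 amounts to guarding every candidate location before probing it. A sketch of that guard (the HIVE_HOME/bin and user.home locations are illustrative assumptions, not the committed lookup order):
{code}
// Sketch of review comment 2 (candidate locations are assumptions): never
// build a path from a null base, so getenv()/getProperty() returning null
// can't produce a probe for a file literally named "null/bin/.hiverc".
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class HivercCandidates {
  public static List<File> candidateInitFiles() {
    List<File> found = new ArrayList<File>();
    addIfPresent(found, System.getenv("HIVE_HOME"), "bin/.hiverc");
    addIfPresent(found, System.getProperty("user.home"), ".hiverc");
    return found;
  }

  private static void addIfPresent(List<File> out, String base, String rel) {
    if (base == null) {
      return; // skip the exists() check entirely when the base is unset
    }
    File f = new File(base, rel);
    if (f.exists()) {
      out.add(f);
    }
  }
}
{code}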
[jira] Updated: (HIVE-1417) Archived partitions throw error with queries calling getContentSummary
[ https://issues.apache.org/jira/browse/HIVE-1417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1417: Attachment: HIVE-1417.1.patch HIVE-1417.branch-0.6.1.patch Expanded test coverage to include join and group by, but this bug doesn't show up during unit tests because the underlying filesystem is not HDFS. > Archived partitions throw error with queries calling getContentSummary > -- > > Key: HIVE-1417 > URL: https://issues.apache.org/jira/browse/HIVE-1417 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.6.0, 0.7.0 >Reporter: Paul Yang >Assignee: Paul Yang > Attachments: HIVE-1417.1.patch, HIVE-1417.branch-0.6.1.patch > > > Assuming you have a src table with a ds='1' partition that is archived in > HDFS, the following query will throw an exception > {code} > select count(1) from src where ds='1' group by key; > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1405) hive command line option -i to run an init file before other SQL commands
[ https://issues.apache.org/jira/browse/HIVE-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Sichi updated HIVE-1405: - Fix Version/s: 0.7.0 (was: 0.6.0) > hive command line option -i to run an init file before other SQL commands > - > > Key: HIVE-1405 > URL: https://issues.apache.org/jira/browse/HIVE-1405 > Project: Hadoop Hive > Issue Type: New Feature > Components: Clients >Affects Versions: 0.5.0 >Reporter: Jonathan Chang >Assignee: John Sichi > Fix For: 0.7.0 > > Attachments: HIVE-1405.1.patch > > > When deploying hive, it would be nice to have a .hiverc file containing > statements that would be automatically run whenever hive is launched. This > way, we can automatically add JARs, create temporary functions, set flags, > etc. for all users quickly. > This should ideally be set up like .bashrc and the like with a global version > and a user-local version. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1417) Archived partitions throw error with queries calling getContentSummary
[ https://issues.apache.org/jira/browse/HIVE-1417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1417: Status: Patch Available (was: Open) Affects Version/s: 0.7.0 > Archived partitions throw error with queries calling getContentSummary > -- > > Key: HIVE-1417 > URL: https://issues.apache.org/jira/browse/HIVE-1417 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.6.0, 0.7.0 >Reporter: Paul Yang >Assignee: Paul Yang > Attachments: HIVE-1417.1.patch, HIVE-1417.branch-0.6.1.patch > > > Assuming you have a src table with a ds='1' partition that is archived in > HDFS, the following query will throw an exception > {code} > select count(1) from src where ds='1' group by key; > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1417) Archived partitions throw error with queries calling getContentSummary
[ https://issues.apache.org/jira/browse/HIVE-1417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-1417: Description: Assuming you have a src table with a ds='1' partition that is archived in HDFS, the following query will throw an exception {code} select count(1) from src where ds='1' group by key; {code} was: Assuming you have a src table with a ds='1' partition that is archived, the following table will throw an exception {code} select count(1) from src where ds='1' group by key; {code} > Archived partitions throw error with queries calling getContentSummary > -- > > Key: HIVE-1417 > URL: https://issues.apache.org/jira/browse/HIVE-1417 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.6.0, 0.7.0 >Reporter: Paul Yang >Assignee: Paul Yang > Attachments: HIVE-1417.1.patch, HIVE-1417.branch-0.6.1.patch > > > Assuming you have a src table with a ds='1' partition that is archived in > HDFS, the following query will throw an exception > {code} > select count(1) from src where ds='1' group by key; > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Re: about to cut the 0.6 release branch...
On Jun 20, 2010, at 5:37 PM, wrote: > * for cases where a single patch is being applied to both trunk and branch, > commit to trunk first, then merge that to branch (rather than reapplying the > patch on branch independently); someone please correct me if I have this > wrong Correction from conversation with Namit: apparently we never use merge, and always apply the patch on the branch directly. JVS
[jira] Updated: (HIVE-1359) Unit test should be shim-aware
[ https://issues.apache.org/jira/browse/HIVE-1359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ning Zhang updated HIVE-1359: - Status: Patch Available (was: Open) Affects Version/s: 0.6.0 0.7.0 Fix Version/s: 0.7.0 > Unit test should be shim-aware > -- > > Key: HIVE-1359 > URL: https://issues.apache.org/jira/browse/HIVE-1359 > Project: Hadoop Hive > Issue Type: New Feature >Affects Versions: 0.6.0, 0.7.0 >Reporter: Ning Zhang >Assignee: Ning Zhang > Fix For: 0.6.0, 0.7.0 > > Attachments: HIVE-1359.patch, unit_tests.txt > > > Some features in Hive only work for certain Hadoop versions, through shims. > However, the unit test structure is not shim-aware: there is only one > set of queries and expected outputs for all Hadoop versions. This may not be > sufficient when different Hadoop versions produce different outputs. > One example is CombineHiveInputFormat, which is only available from Hadoop > 0.20. The plans using CombineHiveInputFormat and HiveInputFormat may be > different. Another example is archived partitions (HAR), which are also only > available from 0.20. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1416) Dynamic partition inserts left empty files uncleaned in hadoop 0.17 local mode
[ https://issues.apache.org/jira/browse/HIVE-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ning Zhang updated HIVE-1416: - Status: Patch Available (was: Open) Affects Version/s: 0.6.0 0.7.0 Fix Version/s: 0.6.0 0.7.0 > Dynamic partition inserts left empty files uncleaned in hadoop 0.17 local mode > -- > > Key: HIVE-1416 > URL: https://issues.apache.org/jira/browse/HIVE-1416 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.6.0, 0.7.0 >Reporter: Ning Zhang >Assignee: Ning Zhang > Fix For: 0.6.0, 0.7.0 > > Attachments: HIVE-1416.patch > > > Hive parses the file names generated by tasks to figure out the task ID in > order to generate files for empty buckets. Different Hadoop versions and > execution modes have different ways of naming the output files written by > mappers/reducers. We need to move the parsing code to the shims. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1359) Unit test should be shim-aware
[ https://issues.apache.org/jira/browse/HIVE-1359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ning Zhang updated HIVE-1359: - Attachment: HIVE-1359.patch > Unit test should be shim-aware > -- > > Key: HIVE-1359 > URL: https://issues.apache.org/jira/browse/HIVE-1359 > Project: Hadoop Hive > Issue Type: New Feature >Reporter: Ning Zhang >Assignee: Ning Zhang > Fix For: 0.6.0 > > Attachments: HIVE-1359.patch, unit_tests.txt > > > Some features in Hive only work for certain Hadoop versions, through shims. > However, the unit test structure is not shim-aware: there is only one > set of queries and expected outputs for all Hadoop versions. This may not be > sufficient when different Hadoop versions produce different outputs. > One example is CombineHiveInputFormat, which is only available from Hadoop > 0.20. The plans using CombineHiveInputFormat and HiveInputFormat may be > different. Another example is archived partitions (HAR), which are also only > available from 0.20. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1412) CombineHiveInputFormat bug on tablesample
[ https://issues.apache.org/jira/browse/HIVE-1412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ning Zhang updated HIVE-1412: - Status: Patch Available (was: Open) Affects Version/s: 0.6.0 0.7.0 Fix Version/s: 0.7.0 > CombineHiveInputFormat bug on tablesample > - > > Key: HIVE-1412 > URL: https://issues.apache.org/jira/browse/HIVE-1412 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.6.0, 0.7.0 >Reporter: Ning Zhang >Assignee: Ning Zhang > Fix For: 0.6.0, 0.7.0 > > Attachments: HIVE-1412.2.patch, HIVE-1412.patch > > > CombineHiveInputFormat should combine all files inside one partition to form > a split, but should not take files across partition boundaries. This works for > regular tables and partitions, since all input paths are directories. However, > this breaks when the inputs are files (in which case tablesample could be the > use case). CombineHiveInputFormat should adjust to the case where the inputs > may also be non-directories. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-1416) Dynamic partition inserts left empty files uncleaned in hadoop 0.17 local mode
[ https://issues.apache.org/jira/browse/HIVE-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ning Zhang updated HIVE-1416: - Attachment: HIVE-1416.patch > Dynamic partition inserts left empty files uncleaned in hadoop 0.17 local mode > -- > > Key: HIVE-1416 > URL: https://issues.apache.org/jira/browse/HIVE-1416 > Project: Hadoop Hive > Issue Type: Bug >Affects Versions: 0.6.0, 0.7.0 >Reporter: Ning Zhang >Assignee: Ning Zhang > Fix For: 0.6.0, 0.7.0 > > Attachments: HIVE-1416.patch > > > Hive parses the file names generated by tasks to figure out the task ID in > order to generate files for empty buckets. Different Hadoop versions and > execution modes have different ways of naming the output files written by > mappers/reducers. We need to move the parsing code to the shims. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
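The shim would own the filename parsing described above. A sketch of the kind of parsing involved (the pattern shown is an illustrative assumption, not the one in HIVE-1416.patch):
{code}
// Illustrative sketch (the pattern is an assumption, not HIVE-1416.patch):
// pull the task ID out of a mapper/reducer output file name so files for
// empty buckets can be created with matching names. Different Hadoop
// versions / execution modes would each supply their own pattern via a shim.
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TaskIdFromFilename {
  // e.g. "000003_0" (cluster mode) or a longer attempt-style prefix
  private static final Pattern TASK_ID =
      Pattern.compile("^.*?([0-9]+)(_[0-9]+)?(\\.[A-Za-z0-9]+)?$");

  /** Returns the numeric task ID portion, or null when nothing matches. */
  public static String taskIdOf(String fileName) {
    Matcher m = TASK_ID.matcher(fileName);
    return m.matches() ? m.group(1) : null;
  }

  public static void main(String[] args) {
    System.out.println(taskIdOf("000003_0")); // prints 000003
  }
}
{code}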
[jira] Updated: (HIVE-1412) CombineHiveInputFormat bug on tablesample
[ https://issues.apache.org/jira/browse/HIVE-1412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ning Zhang updated HIVE-1412: - Attachment: HIVE-1412.2.patch Added a unit test > CombineHiveInputFormat bug on tablesample > - > > Key: HIVE-1412 > URL: https://issues.apache.org/jira/browse/HIVE-1412 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.6.0, 0.7.0 >Reporter: Ning Zhang >Assignee: Ning Zhang > Fix For: 0.6.0, 0.7.0 > > Attachments: HIVE-1412.2.patch, HIVE-1412.patch > > > CombineHiveInputFormat should combine all files inside one partition to form > a split, but should not take files across partition boundaries. This works for > regular tables and partitions, since all input paths are directories. However, > this breaks when the inputs are files (in which case tablesample could be the > use case). CombineHiveInputFormat should adjust to the case where the inputs > may also be non-directories. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
RE: Latest trunk breaks logging?
543 changes logging for local mode. By default, the logs from local mode map-reduce tasks now go to a per-query log file that's written to /tmp//.log This was very much intended to enable local mode to be a bit friendlier to use. -Original Message- From: Mayank Lahiri [mailto:mayank.lah...@facebook.com] Sent: Monday, June 21, 2010 2:54 PM To: John Sichi Cc: hive-dev@hadoop.apache.org Subject: Re: Latest trunk breaks logging? Hi Joydeep, I've confirmed that logging is restored after reverting HIVE-543. The problem is that LOG.warn() calls from inside UDAFs do not generate any output after applying HIVE-543. For example, passing an invalid argument to histogram() causes a general one-line exception instead of the diagnostic HiveException that is supposed to be thrown. This is what HEAD currently returns: - snip -- FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask end snip Was this intended, or are the logs just being written to some other location post-HIVE-543? Thanks, -Mayank On 6/21/10 1:00 PM, "John Sichi" wrote: +hive-dev Looks like Joydeep made some changes to the logging in recently committed HIVE-543; maybe related? Apache JIRA is down at the moment but I found the patch here: http://mail-archives.apache.org/mod_mbox/hadoop-hive-commits/201006.mbox/%3c20100616225037.c67df2388...@eris.apache.org%3e Mayank, if you can confirm that this is the cause, we can check with Joydeep on whether or not this was intentional. The DataNucleus noise I've been seeing for a while now; I think it's harmless, but go ahead and create a JIRA issue to get it cleaned up. JVS On Jun 21, 2010, at 12:34 PM, Mayank Lahiri wrote: Hi John, I just updated trunk and logging seems to be slightly different, and possibly broken. For one, LOG.warn() messages from inside UDAFs don't show up anywhere, and are not printed to console. /tmp/mlahiri/hive.log contains a lot of lines like this: 2010-06-21 12:02:01,486 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.runtime" but it cannot be resolved. 2010-06-21 12:02:01,487 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.text" but it cannot be resolved. 2010-06-21 12:23:23,576 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.resources" but it cannot be resolved. 2010-06-21 12:23:23,578 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.runtime" but it cannot be resolved. 2010-06-21 12:23:23,579 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.text" but it cannot be resolved. 2010-06-21 12:27:45,003 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.resources" but it cannot be resolved. 2010-06-21 12:27:45,005 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.runtime" but it cannot be resolved. 2010-06-21 12:27:45,005 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.text" but it cannot be resolved. 2010-06-21 12:30:13,786 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.resources" but it cannot be resolved. 
2010-06-21 12:30:13,788 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.runtime" but it cannot be resolved. 2010-06-21 12:30:13,789 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.text" but it cannot be resolved. ~ Any ideas what I could be doing differently/wrong? - Mayank
Re: Latest trunk breaks logging?
Hi Joydeep, I’ve confirmed that logging is restored after reverting HIVE-543. The problem is that LOG.warn() calls from inside UDAFs do not generate any output after applying HIVE-543. For example, passing an invalid argument to histogram() causes a general one-line exception instead of the diagnostic HiveException that is supposed to be thrown. This is what HEAD currently returns: - snip -- FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask end snip Was this intended, or are the logs just being written to some other location post-HIVE-543? Thanks, -Mayank On 6/21/10 1:00 PM, "John Sichi" wrote: +hive-dev Looks like Joydeep made some changes to the logging in recently committed HIVE-543; maybe related? Apache JIRA is down at the moment but I found the patch here: http://mail-archives.apache.org/mod_mbox/hadoop-hive-commits/201006.mbox/%3c20100616225037.c67df2388...@eris.apache.org%3e Mayank, if you can confirm that this is the cause, we can check with Joydeep on whether or not this was intentional. The DataNucleus noise I've been seeing for a while now; I think it's harmless, but go ahead and create a JIRA issue to get it cleaned up. JVS On Jun 21, 2010, at 12:34 PM, Mayank Lahiri wrote: Hi John, I just updated trunk and logging seems to be slightly different, and possibly broken. For one, LOG.warn() messages from inside UDAFs don’t show up anywhere, and are not printed to console. /tmp/mlahiri/hive.log contains a lot of lines like this: 2010-06-21 12:02:01,486 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.runtime" but it cannot be resolved. 2010-06-21 12:02:01,487 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.text" but it cannot be resolved. 2010-06-21 12:23:23,576 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.resources" but it cannot be resolved. 2010-06-21 12:23:23,578 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.runtime" but it cannot be resolved. 2010-06-21 12:23:23,579 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.text" but it cannot be resolved. 2010-06-21 12:27:45,003 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.resources" but it cannot be resolved. 2010-06-21 12:27:45,005 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.runtime" but it cannot be resolved. 2010-06-21 12:27:45,005 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.text" but it cannot be resolved. 2010-06-21 12:30:13,786 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.resources" but it cannot be resolved. 2010-06-21 12:30:13,788 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.runtime" but it cannot be resolved. 2010-06-21 12:30:13,789 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.text" but it cannot be resolved. ~ Any ideas what I could be doing differently/wrong? - Mayank
Re: Latest trunk breaks logging?
+hive-dev Looks like Joydeep made some changes to the logging in recently committed HIVE-543; maybe related? Apache JIRA is down at the moment but I found the patch here: http://mail-archives.apache.org/mod_mbox/hadoop-hive-commits/201006.mbox/%3c20100616225037.c67df2388...@eris.apache.org%3e Mayank, if you can confirm that this is the cause, we can check with Joydeep on whether or not this was intentional. The DataNucleus noise I've been seeing for a while now; I think it's harmless, but go ahead and create a JIRA issue to get it cleaned up. JVS On Jun 21, 2010, at 12:34 PM, Mayank Lahiri wrote: Hi John, I just updated trunk and logging seems to be slightly different, and possibly broken. For one, LOG.warn() messages from inside UDAFs don’t show up anywhere, and are not printed to console. /tmp/mlahiri/hive.log contains a lot of lines like this: 2010-06-21 12:02:01,486 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.runtime" but it cannot be resolved. 2010-06-21 12:02:01,487 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.text" but it cannot be resolved. 2010-06-21 12:23:23,576 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.resources" but it cannot be resolved. 2010-06-21 12:23:23,578 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.runtime" but it cannot be resolved. 2010-06-21 12:23:23,579 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.text" but it cannot be resolved. 2010-06-21 12:27:45,003 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.resources" but it cannot be resolved. 2010-06-21 12:27:45,005 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.runtime" but it cannot be resolved. 2010-06-21 12:27:45,005 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.text" but it cannot be resolved. 2010-06-21 12:30:13,786 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.resources" but it cannot be resolved. 2010-06-21 12:30:13,788 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.core.runtime" but it cannot be resolved. 2010-06-21 12:30:13,789 ERROR DataNucleus.Plugin (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.c ore" requires "org.eclipse.text" but it cannot be resolved. ~ Any ideas what I could be doing differently/wrong? - Mayank
[jira] Commented: (HIVE-1419) Policy on deserialization errors
[ https://issues.apache.org/jira/browse/HIVE-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880909#action_12880909 ] Vladimir Klimontovich commented on HIVE-1419: - If it works fine for you now, it won't be broken by this patch. > Policy on deserialization errors > > > Key: HIVE-1419 > URL: https://issues.apache.org/jira/browse/HIVE-1419 > Project: Hadoop Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Affects Versions: 0.5.0 >Reporter: Vladimir Klimontovich >Assignee: Vladimir Klimontovich >Priority: Minor > Fix For: 0.5.1, 0.6.0 > > Attachments: corrupted_records_0.5.patch, > corrupted_records_0.5_ver2.patch, corrupted_records_trunk.patch, > corrupted_records_trunk_ver2.patch > > > When the deserializer throws an exception, the whole map task fails (see > MapOperator.java). It's not always a convenient behavior, especially on > huge datasets where several corrupted lines can be normal practice. > Proposed solution: > 1) Keep a counter of corrupted records > 2) When the counter exceeds a limit (configurable via the > hive.max.deserializer.errors property, 0 by default) throw an exception. > Otherwise just log the exception at WARN level. > Patches for the 0.5 branch and trunk are attached -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1419) Policy on deserialization errors
[ https://issues.apache.org/jira/browse/HIVE-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880908#action_12880908 ] Edward Capriolo commented on HIVE-1419: --- I am looking through this and trying to wrap my head around it. Offhand, do you know what happens in this situation? We have a table whose columns we have added to over time: create table tab (a int, b int); Over time we have added more columns: alter table tab replace columns (a int, b int, c int); This works fine for us, as selecting column c on older data returns null for that column. Will this behaviour be preserved? > Policy on deserialization errors > > > Key: HIVE-1419 > URL: https://issues.apache.org/jira/browse/HIVE-1419 > Project: Hadoop Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Affects Versions: 0.5.0 >Reporter: Vladimir Klimontovich >Assignee: Vladimir Klimontovich >Priority: Minor > Fix For: 0.5.1, 0.6.0 > > Attachments: corrupted_records_0.5.patch, > corrupted_records_0.5_ver2.patch, corrupted_records_trunk.patch, > corrupted_records_trunk_ver2.patch > > > When the deserializer throws an exception, the whole map task fails (see > MapOperator.java). It's not always a convenient behavior, especially on > huge datasets where several corrupted lines can be normal practice. > Proposed solution: > 1) Keep a counter of corrupted records > 2) When the counter exceeds a limit (configurable via the > hive.max.deserializer.errors property, 0 by default) throw an exception. > Otherwise just log the exception at WARN level. > Patches for the 0.5 branch and trunk are attached -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1420) problem with sequence and rcfiles are mixed for null partitions
[ https://issues.apache.org/jira/browse/HIVE-1420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880904#action_12880904 ] Namit Jain commented on HIVE-1420: -- +1 will commit if the tests pass > problem with sequence and rcfiles are mixed for null partitions > --- > > Key: HIVE-1420 > URL: https://issues.apache.org/jira/browse/HIVE-1420 > Project: Hadoop Hive > Issue Type: Bug > Components: Query Processor >Affects Versions: 0.6.0 >Reporter: Namit Jain >Assignee: He Yongqiang > Fix For: 0.6.0, 0.7.0 > > Attachments: hive-1420.1.patch > > > drop table foo; > create table foo (src int, value string) partitioned by (ds string); > alter table foo set fileformat Sequencefile; > insert overwrite table foo partition (ds='1') > select key, value from src; > alter table foo add partition (ds='2'); > alter table foo set fileformat rcfile; > select count(1) from foo; > The above testcase fails -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-801) row-wise IN would be useful
[ https://issues.apache.org/jira/browse/HIVE-801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-801: Affects Version/s: 0.5.0 0.4.1 0.4.0 0.3.0 Component/s: Query Processor > row-wise IN would be useful > --- > > Key: HIVE-801 > URL: https://issues.apache.org/jira/browse/HIVE-801 > Project: Hadoop Hive > Issue Type: New Feature > Components: Query Processor >Affects Versions: 0.3.0, 0.4.0, 0.4.1, 0.5.0 >Reporter: Adam Kramer >Assignee: Paul Yang > Fix For: 0.6.0 > > Attachments: HIVE-801.1.patch, HIVE-801.2.patch, HIVE-801.3.patch > > > SELECT * FROM tablename t > WHERE IN(12345,key1,key2,key3); > ...IN would operate on a given row, and return True when the first argument > equaled at least one of the other arguments. So here IN would return true if > 12345=key1 OR 12345=key2 OR 12345=key3 (but wouldn't test the latter two if > the first matched). > This would also help with https://issues.apache.org/jira/browse/HIVE-783, if > IN were implemented in a manner that allows it to be used in an ON clause. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
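The short-circuit semantics described above are easy to state precisely. A self-contained sketch of those semantics (a plain helper for illustration, not the actual UDF in the attached patches):
{code}
// Self-contained sketch of the semantics described above (not the actual
// UDF): true as soon as the first argument equals any later argument,
// without evaluating the remaining candidates.
public class RowWiseIn {
  public static boolean in(Object needle, Object... candidates) {
    if (needle == null) {
      return false; // simplified NULL handling for the sketch
    }
    for (Object c : candidates) {
      if (needle.equals(c)) {
        return true; // short-circuit: later arguments are not tested
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println(in(12345, 12345, 999)); // true; 999 never compared
  }
}
{code}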
skew join in hive
Hi, I see the skew handling strategy mentioned in HIVE-964. Here are some questions. 1. How do we get the big keys for a table? Launch an MR job to build a histogram on each table? 2. Now that we have the big/skewed keys, do we also have the small/non-skewed keys? Do we process these non-skewed keys in the same way (replicated join), or in the traditional way (redistribution join)? Thanks, -Gang
[jira] Updated: (HIVE-1419) Policy on deserialization errors
[ https://issues.apache.org/jira/browse/HIVE-1419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Klimontovich updated HIVE-1419: Attachment: corrupted_records_0.5_ver2.patch corrupted_records_trunk_ver2.patch A slightly improved version of the patch is attached. MapOperator now skips a record if the deserializer returns null. This makes the deserializer plugin architecture more flexible: if a deserializer considers a record nonsense (corrupted, empty, whatever), it can simply return null to signal Hive not to consider it. > Policy on deserialization errors > > > Key: HIVE-1419 > URL: https://issues.apache.org/jira/browse/HIVE-1419 > Project: Hadoop Hive > Issue Type: Improvement > Components: Serializers/Deserializers >Affects Versions: 0.5.0 >Reporter: Vladimir Klimontovich >Assignee: Vladimir Klimontovich >Priority: Minor > Fix For: 0.5.1, 0.6.0 > > Attachments: corrupted_records_0.5.patch, > corrupted_records_0.5_ver2.patch, corrupted_records_trunk.patch, > corrupted_records_trunk_ver2.patch > > > When the deserializer throws an exception, the whole map task fails (see > MapOperator.java). It's not always a convenient behavior, especially on > huge datasets where several corrupted lines can be normal practice. > Proposed solution: > 1) Keep a counter of corrupted records > 2) When the counter exceeds a limit (configurable via the > hive.max.deserializer.errors property, 0 by default) throw an exception. > Otherwise just log the exception at WARN level. > Patches for the 0.5 branch and trunk are attached -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
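Putting the two emails together, the proposed policy is: skip a record when the deserializer throws (up to the configured limit), and also when it returns null. A self-contained sketch of that policy (hypothetical names and modern Java; the real change lives in the attached MapOperator patches):
{code}
import java.util.function.Function;
import java.util.logging.Logger;

// Self-contained sketch of the proposed policy (hypothetical names; the
// real change is in the attached MapOperator patches): tolerate up to
// maxErrors corrupted records, logging each at WARN, and fail the task
// only once the limit is exceeded.
public class CorruptedRecordPolicy<I, O> {
  private static final Logger LOG =
      Logger.getLogger(CorruptedRecordPolicy.class.getName());
  private final long maxErrors; // hive.max.deserializer.errors, 0 by default
  private long corrupted = 0;

  public CorruptedRecordPolicy(long maxErrors) {
    this.maxErrors = maxErrors;
  }

  /** Returns the deserialized row, or null when the record should be skipped. */
  public O deserialize(Function<I, O> deserializer, I record) {
    O row;
    try {
      row = deserializer.apply(record);
    } catch (RuntimeException e) {
      if (++corrupted > maxErrors) {
        throw e; // previous behavior: the whole map task fails
      }
      LOG.warning("Skipping corrupted record: " + e.getMessage());
      return null;
    }
    // ver2 addition: a deserializer may itself return null to tell Hive
    // "ignore this record" (corrupted, empty, whatever).
    return row;
  }
}
{code}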