Syntax and Semantics for Continuous queries in Hive
Hello, I am curious whether large streaming data can be queried using the syntax and semantics of continuous queries inside Hive, as defined in stream databases (e.g. StreamBase, Coral8, etc.). Continuous queries are essential for the real-time enterprise, log data, and many other applications. Thanks, Shyam Sarkar
Can I specify a query in a test to see execution trace?
Hello, Is there a simple test where I can specify a query and see the execution trace under Eclipse debug mode? Is there a test that interactively asks for a query? Thanks, shyam_sar...@yahoo.com
Call sequence
Hello, I am trying to understand the call sequence of Java classes from the top level (command processor). I can see the Parser and Lexer generated by ANTLR. Can someone please help me with the call sequence? Where is the front-level command processor that collects a query as a string and then calls the parser? Thanks, shyam_sar...@yahoo.com
Re: Call sequence
Thank you.

--- On Wed, 3/11/09, Prasad Chakka pra...@facebook.com wrote:
From: Prasad Chakka pra...@facebook.com
Subject: Re: Call sequence
To: hive-dev@hadoop.apache.org, shyam_sar...@yahoo.com
Date: Wednesday, March 11, 2009, 3:11 PM

Hi Shyam, I find Eclipse especially useful for these kinds of things. Follow the instructions at http://wiki.apache.org/hadoop/Hive/GettingStarted/EclipseSetup and run a unit test in debug mode and observe the stack. Thanks, Prasad

From: Ashish Thusoo athu...@facebook.com
Reply-To: hive-dev@hadoop.apache.org
Date: Wed, 11 Mar 2009 15:08:13 -0700
To: hive-dev@hadoop.apache.org, shyam_sar...@yahoo.com
Subject: RE: Call sequence

That should be Driver.java. Look at the run() method. That calls ParseDriver.parse() to get the AST, and then sem.analyze() to do semantic analysis, optimization and plan generation. Finally it goes through the Task list and runs the tasks according to the dependencies. Ashish

-----Original Message-----
From: Shyam Sarkar [mailto:shyam_sar...@yahoo.com]
Sent: Wednesday, March 11, 2009 2:49 PM
To: hive-dev@hadoop.apache.org
Subject: Call sequence

Hello, I am trying to understand the call sequence of Java classes from the top level (command processor). I can see the Parser and Lexer generated by ANTLR. Can someone please help me with the call sequence? Where is the front-level command processor that collects a query as a string and then calls the parser? Thanks, shyam_sar...@yahoo.com
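[Editorial note] The flow Ashish describes can be sketched as a skeleton. Only Driver.run(), ParseDriver.parse() and the sem.analyze() step are named in the thread; every class, method body, and string below is a simplified stand-in, not Hive's actual code.

```java
import java.util.ArrayList;
import java.util.List;

public class DriverSketch {
    // Stand-in for ParseDriver.parse(): the ANTLR-generated parser producing an AST.
    static String parse(String command) {
        return "AST(" + command + ")";
    }

    // Stand-in for sem.analyze(): semantic analysis, optimization, plan generation,
    // yielding a list of tasks.
    static List<String> analyze(String ast) {
        List<String> tasks = new ArrayList<>();
        tasks.add("MapRedTask[" + ast + "]");
        return tasks;
    }

    // Stand-in for Driver.run(): take the query string, parse it, analyze it,
    // then walk the task list and execute each task in dependency order.
    static List<String> run(String command) {
        String ast = parse(command);
        List<String> plan = analyze(ast);
        List<String> results = new ArrayList<>();
        for (String task : plan) {
            results.add("executed " + task);
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run("SELECT * FROM src"));
    }
}
```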
Please inspect TIMESTAMP design doc
All, Please inspect and leave comments on the TIMESTAMP design doc (mainly created from the MySQL 6.0 spec). The TIMESTAMP implementation can have an impact on other parts of the Hive code, so please let me know which specific syntax should be implemented initially. Thanks, shyam_sar...@yahoo.com
RE: Please check grammar for TIMESTAMP
Dear Ashish, Thanks for the comment. I found the following in MySQL 6.0:

(1) Inside CREATE TABLE, TIMESTAMP does not have any format. It is treated like a primitive type (string).
(2) Inside a SELECT clause, TIMESTAMP(MMDDHHMMSS) is called as a routine with format information for the output spec.

=== MySQL 6.0 function ===
TIMESTAMP(expr), TIMESTAMP(expr1,expr2)

With a single argument, this function returns the date or datetime expression expr as a datetime value. With two arguments, it adds the time expression expr2 to the date or datetime expression expr1 and returns the result as a datetime value.

mysql> SELECT TIMESTAMP('2003-12-31');
        -> '2003-12-31 00:00:00'
mysql> SELECT TIMESTAMP('2003-12-31 12:00:00','12:00:00');
        -> '2004-01-01 00:00:00'
===

As a result, we have to define TIMESTAMP as a primitive type as well as a complex type with format information. I will upgrade the grammar after further inspection. I am going to add a basic design document to JIRA. Please provide suggestions. Thanks, Shyam

--- On Mon, 3/9/09, Ashish Thusoo athu...@facebook.com wrote:
From: Ashish Thusoo athu...@facebook.com
Subject: RE: Please check grammar for TIMESTAMP
To: hive-dev@hadoop.apache.org
Date: Monday, March 9, 2009, 2:52 PM

One immediate issue is that the format string is a lexical token, so a string of that format will not conform to the grammar at places where a string literal is expected. A better approach is to treat the format as a StringLiteral and then do the format checks at typecheck and semantic analysis time. Ashish

-----Original Message-----
From: Shyam Sarkar [mailto:shyam_sar...@yahoo.com]
Sent: Sunday, March 08, 2009 7:16 AM
To: hive-dev@hadoop.apache.org
Subject: Please check grammar for TIMESTAMP

Hi Zheng and others, Could you please check the Hive.g grammar changes for TIMESTAMP (see the comments with // Change by Shyam)? Please review and let me know your feedback. I shall write a short design doc later for review after these short exchanges. Thanks, shyam_sar...@yahoo.com
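[Editorial note] The two MySQL examples quoted above can be reproduced in Java to pin down the semantics. This is a sketch using the java.time API (which postdates this thread); the class and method names are illustrative only.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

public class TimestampSemantics {
    // TIMESTAMP(expr): a bare date becomes a datetime at midnight.
    static LocalDateTime timestamp(String expr) {
        return LocalDate.parse(expr).atStartOfDay();
    }

    // TIMESTAMP(expr1, expr2): add the time expression expr2 to the
    // datetime expression expr1.
    static LocalDateTime timestamp(String expr1, String expr2) {
        LocalTime t = LocalTime.parse(expr2);
        return LocalDateTime.parse(expr1.replace(' ', 'T'))
                .plusHours(t.getHour())
                .plusMinutes(t.getMinute())
                .plusSeconds(t.getSecond());
    }

    public static void main(String[] args) {
        // Mirrors TIMESTAMP('2003-12-31') -> '2003-12-31 00:00:00'
        System.out.println(timestamp("2003-12-31"));
        // Mirrors TIMESTAMP('2003-12-31 12:00:00','12:00:00') -> '2004-01-01 00:00:00'
        System.out.println(timestamp("2003-12-31 12:00:00", "12:00:00"));
    }
}
```

Note how the two-argument form rolls over the date boundary, exactly as in the quoted MySQL example.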
Please check grammar for TIMESTAMP
Hi Zheng and others, Could you please check the Hive.g grammar changes for TIMESTAMP (see the comments with // Change by Shyam)? Please review and let me know your feedback. I shall write a short design doc later for review after these short exchanges. Thanks, shyam_sar...@yahoo.com

grammar Hive;

options {
  output=AST;
  ASTLabelType=CommonTree;
  backtrack=true;
  k=1;
}

tokens {
TOK_INSERT; TOK_QUERY; TOK_SELECT; TOK_SELECTDI; TOK_SELEXPR; TOK_FROM; TOK_TAB; TOK_PARTSPEC; TOK_PARTVAL; TOK_DIR; TOK_LOCAL_DIR; TOK_TABREF; TOK_SUBQUERY; TOK_DESTINATION; TOK_ALLCOLREF; TOK_COLREF; TOK_FUNCTION; TOK_FUNCTIONDI; TOK_WHERE; TOK_OP_EQ; TOK_OP_NE; TOK_OP_LE; TOK_OP_LT; TOK_OP_GE; TOK_OP_GT; TOK_OP_DIV; TOK_OP_ADD; TOK_OP_SUB; TOK_OP_MUL; TOK_OP_MOD; TOK_OP_BITAND; TOK_OP_BITNOT; TOK_OP_BITOR; TOK_OP_BITXOR; TOK_OP_AND; TOK_OP_OR; TOK_OP_NOT; TOK_OP_LIKE; TOK_TRUE; TOK_FALSE; TOK_TRANSFORM; TOK_EXPLIST; TOK_ALIASLIST; TOK_GROUPBY; TOK_ORDERBY; TOK_CLUSTERBY; TOK_DISTRIBUTEBY; TOK_SORTBY; TOK_UNION; TOK_JOIN; TOK_LEFTOUTERJOIN; TOK_RIGHTOUTERJOIN; TOK_FULLOUTERJOIN; TOK_LOAD; TOK_NULL; TOK_ISNULL; TOK_ISNOTNULL; TOK_TINYINT; TOK_SMALLINT; TOK_INT; TOK_BIGINT; TOK_BOOLEAN; TOK_FLOAT; TOK_DOUBLE; TOK_DATE; TOK_DATETIME; TOK_TIMESTAMP; TOK_STRING; TOK_LIST; TOK_MAP; TOK_CREATETABLE; TOK_DESCTABLE; TOK_ALTERTABLE_RENAME; TOK_ALTERTABLE_ADDCOLS; TOK_ALTERTABLE_REPLACECOLS; TOK_ALTERTABLE_ADDPARTS; TOK_ALTERTABLE_DROPPARTS; TOK_ALTERTABLE_SERDEPROPERTIES; TOK_ALTERTABLE_SERIALIZER; TOK_ALTERTABLE_PROPERTIES; TOK_MSCK; TOK_SHOWTABLES; TOK_SHOWPARTITIONS; TOK_CREATEEXTTABLE; TOK_DROPTABLE; TOK_TABCOLLIST; TOK_TABCOL; TOK_TABLECOMMENT; TOK_TABLEPARTCOLS; TOK_TABLEBUCKETS; TOK_TABLEROWFORMAT; TOK_TABLEROWFORMATFIELD; TOK_TABLEROWFORMATCOLLITEMS; TOK_TABLEROWFORMATMAPKEYS; TOK_TABLEROWFORMATLINES; TOK_TBLSEQUENCEFILE; TOK_TBLTEXTFILE; TOK_TABLEFILEFORMAT; TOK_TABCOLNAME; TOK_TABLELOCATION; TOK_PARTITIONLOCATION; TOK_TABLESAMPLE; TOK_TMP_FILE; TOK_TABSORTCOLNAMEASC; TOK_TABSORTCOLNAMEDESC; TOK_CHARSETLITERAL; TOK_CREATEFUNCTION; TOK_EXPLAIN; TOK_TABLESERIALIZER; TOK_TABLEPROPERTIES; TOK_TABLEPROPLIST; TOK_TABTYPE; TOK_LIMIT; TOK_TABLEPROPERTY; TOK_IFNOTEXISTS;
}

// Package headers
@header { package org.apache.hadoop.hive.ql.parse; }
@lexer::header { package org.apache.hadoop.hive.ql.parse; }

@members {
  Stack<String> msgs = new Stack<String>();
}

@rulecatch {
catch (RecognitionException e) {
  reportError(e);
  throw e;
}
}

// starting rule
statement
	: explainStatement EOF
	| execStatement EOF
	;

explainStatement
@init { msgs.push("explain statement"); }
@after { msgs.pop(); }
	: KW_EXPLAIN (isExtended=KW_EXTENDED)? execStatement
	-> ^(TOK_EXPLAIN execStatement $isExtended?)
	;

execStatement
@init { msgs.push("statement"); }
@after { msgs.pop(); }
	: queryStatementExpression
	| loadStatement
	| ddlStatement
	;

loadStatement
@init { msgs.push("load statement"); }
@after { msgs.pop(); }
	: KW_LOAD KW_DATA (islocal=KW_LOCAL)? KW_INPATH (path=StringLiteral) (isoverwrite=KW_OVERWRITE)? KW_INTO KW_TABLE (tab=tabName)
	-> ^(TOK_LOAD $path $tab $islocal? $isoverwrite?)
	;

ddlStatement
@init { msgs.push("ddl statement"); }
@after { msgs.pop(); }
	: createStatement
	| dropStatement
	| alterStatement
	| descStatement
	| showStatement
	| metastoreCheck
	| createFunctionStatement
	;

ifNotExists
@init { msgs.push("if not exists clause"); }
@after { msgs.pop(); }
	: KW_IF KW_NOT KW_EXISTS
	-> ^(TOK_IFNOTEXISTS)
	;

createStatement
@init { msgs.push("create statement"); }
@after { msgs.pop(); }
	: KW_CREATE (ext=KW_EXTERNAL)? KW_TABLE ifNotExists? name=Identifier (LPAREN columnNameTypeList RPAREN)? tableComment? tablePartition? tableBuckets? tableRowFormat? tableFileFormat? tableLocation?
	-> {$ext == null}? ^(TOK_CREATETABLE $name ifNotExists? columnNameTypeList? tableComment? tablePartition? tableBuckets? tableRowFormat? tableFileFormat? tableLocation?)
	-> ^(TOK_CREATEEXTTABLE $name ifNotExists? columnNameTypeList? tableComment? tablePartition? tableBuckets? tableRowFormat? tableFileFormat? tableLocation?)
	;

dropStatement
@init { msgs.push("drop statement"); }
@after { msgs.pop(); }
	: KW_DROP KW_TABLE Identifier
	-> ^(TOK_DROPTABLE Identifier)
	;

alterStatement
@init { msgs.push("alter statement"); }
@after { msgs.pop(); }
	: alterStatementRename
	| alterStatementAddCol
	| alterStatementDropPartitions
	| alterStatementAddPartitions
	| alterStatementProperties
	| alterStatementSerdeProperties
	;

alterStatementRename
@init { msgs.push("rename statement"); }
@after { msgs.pop(); }
	: KW_ALTER KW_TABLE oldName=Identifier KW_RENAME KW_TO newName=Identifier
	-> ^(TOK_ALTERTABLE_RENAME $oldName $newName)
	;

alterStatementAddCol
@init { msgs.push("add column statement"); }
@after { msgs.pop(); }
	: KW_ALTER KW_TABLE Identifier (add=KW_ADD | replace=KW_REPLACE) KW_COLUMNS LPAREN
Re: Please check grammar for TIMESTAMP
Yes, there will be time zone support. We shall follow the MySQL 6.0 TIMESTAMP specification: http://dev.mysql.com/doc/refman/6.0/en/timestamp.html Thanks, shyam_sar...@yahoo.com

--- On Sun, 3/8/09, Tim Hawkins tim.hawk...@bejant.com wrote:
From: Tim Hawkins tim.hawk...@bejant.com
Subject: Re: Please check grammar for TIMESTAMP
To: hive-dev@hadoop.apache.org
Date: Sunday, March 8, 2009, 7:22 AM

Is there going to be any time zone support? I.e., will the timestamp be stored in a recognised standard such as UTC regardless of the actual time submitted? Given that Hive/Hadoop tend to be used for log processing and reporting in many use cases, understanding and normalising time-zone details may be necessary, especially where you may have data sourced from multiple time zones. It may be worth considering this issue now, as retrofitting it later may cause problems.

On 8 Mar 2009, at 14:15, Shyam Sarkar wrote: Hi Zheng and others, Could you please check the Hive.g grammar changes for TIMESTAMP (see the comments with // Change by Shyam)? Please review and let me know your feedback. I shall write a short design doc later for review after these short exchanges. Thanks, shyam_sar...@yahoo.com
TIMESTAMP type
Hello, I inspected the grammar Hive.g and decided to create a new type for TIMESTAMP. TIMESTAMP is not a primitive type, a list type, or a map type. It is a timestamp type of the form TIMESTAMP(MMDDHHMMSS), which is different from the other types. Please let me know if there are any other suggestions. Thanks, shyam_sar...@yahoo.com
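[Editorial note] Handling a compact format argument like the one above amounts to parsing and printing against an explicit pattern. A sketch in Java (java.time, which postdates this thread); the full "yyyyMMddHHmmss" pattern is an assumption here, since MySQL's compact TIMESTAMP display also carries the year:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class FormatSketch {
    // Assumed stand-in for the format spec discussed above: a compact
    // year-month-day-hour-minute-second pattern with no separators.
    static final DateTimeFormatter COMPACT =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    // Parse a compact timestamp string into a datetime value.
    static LocalDateTime parseCompact(String s) {
        return LocalDateTime.parse(s, COMPACT);
    }

    // Print a datetime value back in the compact form.
    static String formatCompact(LocalDateTime t) {
        return t.format(COMPACT);
    }

    public static void main(String[] args) {
        LocalDateTime t = parseCompact("20090311150813");
        System.out.println(t);                // 2009-03-11T15:08:13
        System.out.println(formatCompact(t)); // 20090311150813
    }
}
```

The point of keeping the format out of the lexer (as Ashish suggests in the grammar thread) is that a pattern like this can be validated during semantic analysis rather than by the tokenizer.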
Re: Need help on Hive.g and parser!
Thank you. I went through ANTLR. Just curious -- was any comparison done between JavaCC and ANTLR? How is the quality of the code generated by ANTLR compared to JavaCC? This could be an issue if in future we would like to embed XML or Java scripts inside Hive QL (not very important at this point). Advanced SQL syntax embeds XML and Java scripts. Thanks, Shyam

--- On Tue, 2/17/09, Zheng Shao zsh...@gmail.com wrote:
From: Zheng Shao zsh...@gmail.com
Subject: Re: Need help on Hive.g and parser!
To: hive-dev@hadoop.apache.org, shyam_sar...@yahoo.com
Date: Tuesday, February 17, 2009, 10:01 PM

We are using ANTLR. Basically, the rule checks the timestamp of HiveParser.java. If it's newer than Hive.g, then we don't need to regenerate HiveParser.java from Hive.g again. Zheng

On Tue, Feb 17, 2009 at 12:15 PM, Shyam Sarkar shyam_sar...@yahoo.com wrote: Hello, could someone please explain the following build.xml spec for grammar build (required and not required)?

===
<uptodate property="grammarBuild.notRequired">
  <srcfiles dir="${src.dir}/org/apache/hadoop/hive/ql/parse" includes="**/*.g"/>
  <mapper type="merge" to="${build.dir.hive}/ql/gen-java/org/apache/hadoop/hive/ql/parse/HiveParser.java"/>
</uptodate>

<target name="build-grammar" unless="grammarBuild.notRequired">
  <echo>Building Grammar ${src.dir}/org/apache/hadoop/hive/ql/parse/Hive.g</echo>
  <java classname="org.antlr.Tool" classpathref="classpath" fork="true">
    <arg value="-fo"/>
    <arg value="${build.dir.hive}/ql/gen-java/org/apache/hadoop/hive/ql/parse"/>
    <arg value="${src.dir}/org/apache/hadoop/hive/ql/parse/Hive.g"/>
  </java>
</target>
===

Also, can someone tell me which parser generator is used? I used JavaCC in the past. Thanks, shyam_sar...@yahoo.com

-- Yours, Zheng
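[Editorial note] The uptodate rule Zheng describes boils down to a file-timestamp comparison between Hive.g and the generated HiveParser.java. The same check can be sketched directly in Java (hypothetical helper; Ant's real uptodate task is more general):

```java
import java.io.File;

public class GrammarBuildCheck {
    // Mirrors the effect of Ant's <uptodate> above: regeneration is needed
    // only when the generated parser is missing, or the grammar file has a
    // newer modification time than the generated parser.
    static boolean needsRegeneration(File grammar, File generatedParser) {
        if (!generatedParser.exists()) {
            return true; // nothing generated yet
        }
        return grammar.lastModified() > generatedParser.lastModified();
    }

    public static void main(String[] args) {
        File grammar = new File("ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g");
        File parser = new File("build/ql/gen-java/org/apache/hadoop/hive/ql/parse/HiveParser.java");
        System.out.println(needsRegeneration(grammar, parser));
    }
}
```

When the check returns false, Ant sets grammarBuild.notRequired and the build-grammar target (guarded by unless=) is skipped entirely.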
Need help on Hive.g and parser!
Hello, could someone please explain the following build.xml spec for grammar build (required and not required)?

===
<uptodate property="grammarBuild.notRequired">
  <srcfiles dir="${src.dir}/org/apache/hadoop/hive/ql/parse" includes="**/*.g"/>
  <mapper type="merge" to="${build.dir.hive}/ql/gen-java/org/apache/hadoop/hive/ql/parse/HiveParser.java"/>
</uptodate>

<target name="build-grammar" unless="grammarBuild.notRequired">
  <echo>Building Grammar ${src.dir}/org/apache/hadoop/hive/ql/parse/Hive.g</echo>
  <java classname="org.antlr.Tool" classpathref="classpath" fork="true">
    <arg value="-fo"/>
    <arg value="${build.dir.hive}/ql/gen-java/org/apache/hadoop/hive/ql/parse"/>
    <arg value="${src.dir}/org/apache/hadoop/hive/ql/parse/Hive.g"/>
  </java>
</target>
===

Also, can someone tell me which parser generator is used? I used JavaCC in the past. Thanks, shyam_sar...@yahoo.com
Server time zone !
Is there any 'server time zone' implementation inside Hive? For a proper implementation of the TIMESTAMP data type, this is necessary in order to translate from the stored string type. I am focusing on MySQL 6.0 (with limited properties) for TIMESTAMP. http://dev.mysql.com/doc/refman/6.0/en/timestamp.html Thanks, Shyam
Need LOCALTIMESTAMP ?
Hello, Please help me to understand what I am going to implement for Timestamp. Do we need a LOCALTIMESTAMP implementation? See the comparisons below:

=== LOCALTIMESTAMP ===
It's often important to get the value of the current date and time. Below are the functions used to do that in the different implementations.

Standard: The current timestamp (without time zone) is retrieved with the LOCALTIMESTAMP function, which may be used as SELECT LOCALTIMESTAMP ... or SELECT LOCALTIMESTAMP(precision) ... Note that SELECT LOCALTIMESTAMP() ... is illegal: if you don't care about the precision, then you must not use any parentheses. If the DBMS supports the non-core time zone features (feature ID F411), then it must also provide the functions CURRENT_TIMESTAMP and CURRENT_TIMESTAMP(precision), which return a value of type TIMESTAMP WITH TIME ZONE. If it doesn't support time zones, then the DBMS must not provide a CURRENT_TIMESTAMP function.

PostgreSQL: Follows the standard.

DB2: Doesn't have the LOCALTIMESTAMP function. Instead, it provides a special, magic value ('special register' in IBM language), CURRENT_TIMESTAMP (alias 'CURRENT TIMESTAMP'), which may be used as though it were a function without arguments. However, since DB2 doesn't provide TIMESTAMP WITH TIME ZONE support, the availability of CURRENT_TIMESTAMP could be said to be against the standard, or at least confusing.

MSSQL: Doesn't have the LOCALTIMESTAMP function. Instead, it has CURRENT_TIMESTAMP, which, however, doesn't return a value of TIMESTAMP WITH TIME ZONE, but rather a value of MSSQL's DATETIME type (which doesn't contain time zone information).

MySQL: Follows the standard.

Oracle: Follows the standard.

Informix: On my TODO.

Thanks, shyam_sar...@yahoo.com
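[Editorial note] The distinctions in the comparison above can be sketched in Java (java.time, which postdates this thread): LOCALTIMESTAMP corresponds to a local datetime without a zone, CURRENT_TIMESTAMP (standard, with time zone support) to a value carrying a zone offset, and the optional precision argument to truncating fractional seconds.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.temporal.ChronoUnit;

public class NowSketch {
    // LOCALTIMESTAMP: current datetime, no time zone attached.
    static LocalDateTime localTimestamp() {
        return LocalDateTime.now();
    }

    // LOCALTIMESTAMP(0): precision 0, i.e. fractional seconds dropped.
    static LocalDateTime localTimestampPrecisionZero() {
        return LocalDateTime.now().truncatedTo(ChronoUnit.SECONDS);
    }

    // CURRENT_TIMESTAMP per the standard: the value carries a zone offset,
    // roughly TIMESTAMP WITH TIME ZONE.
    static OffsetDateTime currentTimestamp() {
        return OffsetDateTime.now();
    }

    public static void main(String[] args) {
        System.out.println(localTimestamp());
        System.out.println(localTimestampPrecisionZero()); // no fractional seconds
        System.out.println(currentTimestamp());            // ends with a zone offset
    }
}
```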
Datetime type in SQL standard
The following is the BNF for the datetime type in SQL:2003:

<datetime type> ::=
    DATE
  | TIME [ <left paren> <time precision> <right paren> ] [ <with or without time zone> ]
  | TIMESTAMP [ <left paren> <timestamp precision> <right paren> ] [ <with or without time zone> ]

Please let me know whether we implement the standard completely or not. Thanks, shyam_sar...@yahoo.com
timestamp examples in standard SQL
Some examples with timestamp in standard SQL:

Create Table

CREATE TABLE Stu_Table (
  Stu_Id varchar(2),
  Stu_Name varchar(10),
  Stu_Dob timestamp NOT NULL
);

Insert Data Into Stu_Table

Now the INSERT INTO statement is used to add records (rows) to the table 'Stu_Table'.

Insert Into Stu_Table Values('1', 'Komal', '1984-10-27');
Insert Into Stu_Table Values('2', 'ajay', '1985-04-19');
Insert Into Stu_Table Values('3', 'Santosh', '1986-11-16');

Stu_Table
+--------+----------+---------------------+
| Stu_Id | Stu_Name | Stu_Dob             |
+--------+----------+---------------------+
| 1      | Komal    | 1984-10-27 00:00:00 |
| 2      | ajay     | 1985-04-19 00:00:00 |
| 3      | Santosh  | 1986-11-16 00:00:00 |
+--------+----------+---------------------+

Query

The query below returns the records selected by the SELECT statement. The WHERE clause restricts the query to the records whose Stu_Dob column lies between '1984-01-01' and '1986-1-1'.

Select * From Stu_Table Where Stu_Dob Between '1984-01-01' And '1986-1-1';

Result
+--------+----------+---------------------+
| Stu_Id | Stu_Name | Stu_Dob             |
+--------+----------+---------------------+
| 1      | Komal    | 1984-10-27 00:00:00 |
| 2      | ajay     | 1985-04-19 00:00:00 |
+--------+----------+---------------------+
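[Editorial note] The WHERE clause above uses BETWEEN, which is inclusive at both ends (lo <= x AND x <= hi). The same range logic, sketched in Java with a hypothetical helper:

```java
import java.time.LocalDate;

public class BetweenSketch {
    // SQL "x BETWEEN lo AND hi" is inclusive on both bounds.
    static boolean between(LocalDate x, LocalDate lo, LocalDate hi) {
        return !x.isBefore(lo) && !x.isAfter(hi);
    }

    public static void main(String[] args) {
        LocalDate lo = LocalDate.parse("1984-01-01");
        LocalDate hi = LocalDate.parse("1986-01-01");
        // Mirrors which rows the sample query returns:
        System.out.println(between(LocalDate.parse("1984-10-27"), lo, hi)); // true  (Komal)
        System.out.println(between(LocalDate.parse("1985-04-19"), lo, hi)); // true  (ajay)
        System.out.println(between(LocalDate.parse("1986-11-16"), lo, hi)); // false (Santosh)
    }
}
```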
RE: Need LOCALTIMESTAMP ?
Hi Ashish, Please read about the latest TIMESTAMP implementation in the MySQL 5.0 version and suggest: http://dev.mysql.com/doc/refman/5.0/en/timestamp.html

Also please comment on the following MySQL 5.0 implementation semantics: TIMESTAMP values are converted from the current time zone to UTC for storage, and converted back from UTC to the current time zone for retrieval. (This occurs only for the TIMESTAMP data type, not for other types such as DATETIME.) By default, the current time zone for each connection is the server's time. Should we do the same thing? Thanks, shyam_sar...@yahoo.com

--- On Wed, 2/11/09, Ashish Thusoo athu...@facebook.com wrote:
From: Ashish Thusoo athu...@facebook.com
Subject: RE: Need LOCALTIMESTAMP ?
To: hive-dev@hadoop.apache.org, shyam_sar...@yahoo.com
Date: Wednesday, February 11, 2009, 2:55 PM

Hi Shyam, I think HIVE-192 is about the fact that there is no support for the timestamp type in Hive (or for that matter the date and datetime types). In FB we are using strings to hold this information. If you are planning to add a built-in function like LOCALTIMESTAMP, then that should probably go into a different JIRA. We have tried to adhere to the MySQL way of doing things as we find more folks using it (at least in our company), and it looks from your research that they are basically standards compliant. So my vote will be to go with MySQL semantics and the CURRENT_TIMESTAMP construct. Ashish

-----Original Message-----
From: Shyam Sarkar [mailto:shyam_sar...@yahoo.com]
Sent: Wednesday, February 11, 2009 2:37 PM
To: hive-dev@hadoop.apache.org
Subject: Need LOCALTIMESTAMP ?

Hello, Please help me to understand what I am going to implement for Timestamp. Do we need a LOCALTIMESTAMP implementation? See the comparisons below:

=== LOCALTIMESTAMP ===
It's often important to get the value of the current date and time. Below are the functions used to do that in the different implementations.

Standard: The current timestamp (without time zone) is retrieved with the LOCALTIMESTAMP function, which may be used as SELECT LOCALTIMESTAMP ... or SELECT LOCALTIMESTAMP(precision) ... Note that SELECT LOCALTIMESTAMP() ... is illegal: if you don't care about the precision, then you must not use any parentheses. If the DBMS supports the non-core time zone features (feature ID F411), then it must also provide the functions CURRENT_TIMESTAMP and CURRENT_TIMESTAMP(precision), which return a value of type TIMESTAMP WITH TIME ZONE. If it doesn't support time zones, then the DBMS must not provide a CURRENT_TIMESTAMP function.

PostgreSQL: Follows the standard.

DB2: Doesn't have the LOCALTIMESTAMP function. Instead, it provides a special, magic value ('special register' in IBM language), CURRENT_TIMESTAMP (alias 'CURRENT TIMESTAMP'), which may be used as though it were a function without arguments. However, since DB2 doesn't provide TIMESTAMP WITH TIME ZONE support, the availability of CURRENT_TIMESTAMP could be said to be against the standard, or at least confusing.

MSSQL: Doesn't have the LOCALTIMESTAMP function. Instead, it has CURRENT_TIMESTAMP, which, however, doesn't return a value of TIMESTAMP WITH TIME ZONE, but rather a value of MSSQL's DATETIME type (which doesn't contain time zone information).

MySQL: Follows the standard.

Oracle: Follows the standard.

Informix: On my TODO.

Thanks, shyam_sar...@yahoo.com
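[Editorial note] The MySQL storage rule quoted in this thread (convert session-zone wall time to UTC on write, back on read) round-trips cleanly. A sketch with java.time (API postdates this thread; the zone and example instant are chosen arbitrarily):

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class UtcStorageSketch {
    // Store: interpret the wall-clock value in the connection's time zone
    // and normalize it to UTC (an Instant).
    static Instant store(LocalDateTime wallClock, ZoneId sessionZone) {
        return wallClock.atZone(sessionZone).toInstant();
    }

    // Retrieve: convert the stored UTC instant back into the session zone.
    static LocalDateTime retrieve(Instant stored, ZoneId sessionZone) {
        return LocalDateTime.ofInstant(stored, sessionZone);
    }

    public static void main(String[] args) {
        ZoneId session = ZoneId.of("America/Los_Angeles"); // arbitrary example zone
        LocalDateTime wall = LocalDateTime.parse("2009-02-11T14:37:00");
        Instant stored = store(wall, session);
        // Stored form, viewed in UTC, is shifted by the zone offset (PST = UTC-8):
        System.out.println(LocalDateTime.ofInstant(stored, ZoneOffset.UTC));
        // Reading it back in the same session zone recovers the wall-clock value:
        System.out.println(retrieve(stored, session).equals(wall)); // true
    }
}
```

The caveat, as in MySQL, is that the round trip only holds when the session zone is the same on write and read; a different reader zone yields a correctly shifted wall-clock value.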
RE: Implementing Timestamp (HIVE-192)
Thank you all. I am looking forward to your help in solving various issues. Regards, Shyam

--- On Tue, 2/10/09, Ashish Thusoo athu...@facebook.com wrote:
From: Ashish Thusoo athu...@facebook.com
Subject: RE: Implementing Timestamp (HIVE-192)
To: hive-dev@hadoop.apache.org
Date: Tuesday, February 10, 2009, 11:01 AM

Go for it Shyam... All of us are available for help on this mailing list and we do hop onto the IRC channel from time to time. Ashish

-----Original Message-----
From: Johan Oskarsson [mailto:jo...@oskarsson.nu]
Sent: Tuesday, February 10, 2009 2:06 AM
To: hive-dev@hadoop.apache.org
Subject: Re: Implementing Timestamp (HIVE-192)

The ticket hasn't been commented on, so I'm going to assume that nobody is working on it; please do go ahead and have a go at it. As for who you should talk to, someone on this list should probably jump in, but the committers are a good start: http://hadoop.apache.org/hive/credits.html There's also an IRC channel where some of them pop in now and again: ##hive at irc.freenode.net. I haven't worked with serde much so can't give any pointers on where to start. /Johan

Shyam Sarkar wrote: Hello, I would like to go deeper into the Hive code (starting from the parser all the way to the file system) by implementing a small feature first. I would like to add code for the timestamp implementation (HIVE-192). Is that OK to implement? Is there any suggestion? Also, I would like to know who I should talk to about new feature suggestions in future (is someone the main architect?). What is the process involved? Please let me know. Regards, shyam_sar...@yahoo.com Shyam Sundar Sarkar, Ph.D. Founder, AyushNet 650-962-0900
HiveQL and SQL 2003
Hello, I am curious whether certain object and type definition features of the SQL 2003 standard can be implemented as part of HiveQL. It makes sense because Hadoop was designed as a non-SQL parallel database where operations are written as Java classes. Syntax and semantics for types and object packages within the SQL syntax would make HiveQL much stronger. Regards, shyam_sar...@yahoo.com
RE: Eclipse run fails !!
Dear Ashish, For the last few days I tried Eclipse 3.4.1 with the 0.17.2.1 version and got the same errors with Run > Run. Then I looked into the bin/hive command and found that it could not create a table in HDFS. The reason was that I could not create the /user/hive/warehouse directory inside HDFS. It was using the Linux FS. This is why I switched to 0.19.0, where directories in HDFS can be created. Could you please tell me which exact version of Hadoop will work fine with Eclipse runs? I want to get rid of errors in the project itself (before any run). Regards, Shyam

--- On Tue, 2/3/09, Ashish Thusoo athu...@facebook.com wrote:
From: Ashish Thusoo athu...@facebook.com
Subject: RE: Eclipse run fails !!
To: hive-dev@hadoop.apache.org, shyam_sar...@yahoo.com
Date: Tuesday, February 3, 2009, 11:38 AM

Hi Shyam, We have not really tried the Eclipse stuff for 0.19.0. Is it possible for you to use 0.17.0 for now, while we figure this out... Ashish

-----Original Message-----
From: Shyam Sarkar [mailto:shyam_sar...@yahoo.com]
Sent: Tuesday, February 03, 2009 11:26 AM
To: hive-dev@hadoop.apache.org
Subject: Eclipse run fails !!

Hello, I have the hive project loaded inside Eclipse 3.4.1 and hadoop 0.19.0 is running in the background. I could create tables from the bin/hive command. But when I try Run > Run inside Eclipse it says:

Errors exist with required project(s): hive
Proceed with launch?

and then it gives many errors. Can someone please tell me why there are errors in the hive project? I followed all the steps correctly from the hive wiki. Regards, shyam_sar...@yahoo.com
Build fails for eclipse-templates
Hello, I am a new developer on hive and hadoop. I downloaded hive and the hadoop 0.17.2.1 version to test inside Eclipse, but my test failed. I could not create a directory inside HDFS of the 0.17.2.1 version, but I could do so using the 0.19.0 version. So I wanted to try the Eclipse compile/test with 0.19.0 and downloaded hive again. But now the eclipse-templates build fails:

===
[ssar...@ayush2 hive]$ /usr/java/apache-ant-1.7.1/bin/ant eclipse-templates/ -Dhadoop.version=0.19.0
Buildfile: build.xml

BUILD FAILED
Target "eclipse-templates/" does not exist in the project "hive".
===

Can someone tell me why the build fails with new hive checkouts? Regards, shyam_sar...@yahoo.com
RE: Eclipse run fails !!
)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
	at junit.framework.TestSuite.createTest(TestSuite.java:131)
	at junit.framework.TestSuite.addTestMethod(TestSuite.java:114)
	at junit.framework.TestSuite.<init>(TestSuite.java:75)
	at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestLoader.getTest(JUnit3TestLoader.java:102)
	at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestLoader.loadTests(JUnit3TestLoader.java:59)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:445)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:673)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:386)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196)

java.lang.ExceptionInInitializerError
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
	at junit.framework.TestSuite.createTest(TestSuite.java:131)
	at junit.framework.TestSuite.addTestMethod(TestSuite.java:114)
	at junit.framework.TestSuite.<init>(TestSuite.java:75)
	at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestLoader.getTest(JUnit3TestLoader.java:102)
	at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestLoader.loadTests(JUnit3TestLoader.java:59)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:445)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:673)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:386)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196)
Caused by: java.lang.RuntimeException: Encountered throwable
	at org.apache.hadoop.hive.ql.exec.TestExecDriver.<clinit>(TestExecDriver.java:113)
	... 13 more
==

Regards, Shyam

--- On Tue, 2/3/09, Ashish Thusoo athu...@facebook.com wrote:
From: Ashish Thusoo athu...@facebook.com
Subject: RE: Eclipse run fails !!
To: Shyam Sarkar shyam_sar...@yahoo.com, hive-dev@hadoop.apache.org
Date: Tuesday, February 3, 2009, 1:46 PM

Hi Shyam, I can certainly say that 0.17.0 should work with Eclipse. I have been doing that for a while. Maybe we can concentrate on fixing why you are not able to create a table in HDFS. I am not sure why you could not create the /user/hive/warehouse directory in 0.17. Are you saying that hadoop dfs -mkdir /user/facebook/hive does not work for you? Can you send out the output when you run this command? Ashish

PS: using -Dhadoop.version=0.17.0 for all the commands that are given in the wiki should make things work in Eclipse.

-----Original Message-----
From: Shyam Sarkar [mailto:shyam_sar...@yahoo.com]
Sent: Tuesday, February 03, 2009 12:00 PM
To: hive-dev@hadoop.apache.org; Ashish Thusoo
Subject: RE: Eclipse run fails !!

Dear Ashish, For the last few days I tried Eclipse 3.4.1 with the 0.17.2.1 version and got the same errors with Run > Run. Then I looked into the bin/hive command and found that it could not create a table in HDFS. The reason was that I could not create the /user/hive/warehouse directory inside HDFS. It was using the Linux FS. This is why I switched to 0.19.0, where directories in HDFS can be created. Could you please tell me which exact version of Hadoop will work fine with Eclipse runs? I want to get rid of errors in the project itself (before any run).
Regards, Shyam

--- On Tue, 2/3/09, Ashish Thusoo athu...@facebook.com wrote:
From: Ashish Thusoo athu...@facebook.com
Subject: RE: Eclipse run fails !!
To: hive-dev@hadoop.apache.org, shyam_sar...@yahoo.com
Date: Tuesday, February 3, 2009, 11:38 AM

Hi Shyam, We have not really tried the Eclipse stuff for 0.19.0. Is it possible for you to use 0.17.0 for now, while we figure this out... Ashish

-----Original Message-----
From: Shyam Sarkar [mailto:shyam_sar...@yahoo.com]
Sent: Tuesday, February 03, 2009 11:26 AM
To: hive-dev@hadoop.apache.org
Subject: Eclipse run fails !!

Hello, I have the hive project loaded inside Eclipse 3.4.1
Re: Need Help on Eclipse + JDK to load hive project !!
Would I lose any recent features if I do not use the hadoop-0.19.0 version?

--- On Sat, 1/31/09, Prasad Chakka pra...@facebook.com wrote:
From: Prasad Chakka pra...@facebook.com
Subject: Re: Need Help on Eclipse + JDK to load hive project !!
To: hive-dev@hadoop.apache.org, shyam_sar...@yahoo.com
Date: Saturday, January 31, 2009, 11:52 AM

I use Eclipse 3.4.0 (3.4.1 also works) with Java 6. Try the instructions from here: http://wiki.apache.org/hadoop/Hive/GettingStarted/EclipseSetup I use the hadoop-0.17.2.1 version instead of hadoop-0.19, since the former works well with running unit tests.

From: Shyam Sarkar shyam_sar...@yahoo.com
Reply-To: hive-dev@hadoop.apache.org, shyam_sar...@yahoo.com
Date: Sat, 31 Jan 2009 11:46:35 -0800
To: hive-dev@hadoop.apache.org
Subject: Need Help on Eclipse + JDK to load hive project !!

Hi, I have a Linux box running Red Hat Linux 4. I tried many Eclipse and JDK versions, but finally Eclipse 3.2.1 worked with the JDK 1.5 Java runtime. I could not load source files from hive into a project. COULD SOMEONE PLEASE TELL ME THE FOLLOWING:

(1) Is this version of Eclipse + JDK fine for hive development?
(2) How do I load the whole hive project into Eclipse?

Regards, shyam_sar...@yahoo.com
Re: Need Help on Eclipse + JDK to load hive project !!
I am getting the following error for the build (ant package):

===
[ssar...@ayush2 hive]$ /usr/java/apache-ant-1.7.1/bin/ant package -Dhadoop.version=hadoop-0.17.2.1
Buildfile: build.xml

deploy:
init:
download-ivy:
init-ivy:
settings-ivy:
resolve:
[ivy:retrieve] :: Ivy 2.0.0-rc2 - 20081028224207 :: http://ant.apache.org/ivy/ ::
:: loading settings :: file = /home/ssarkar/hive/ivy/ivysettings.xml
[ivy:retrieve] :: resolving dependencies :: org.apache.hadoop.hive#common;work...@ayush2
[ivy:retrieve]  confs: [default]
[ivy:retrieve] :: resolution report :: resolve 231ms :: artifacts dl 0ms
	---------------------------------------------------------------------
	|                  |            modules            ||   artifacts   |
	|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
	---------------------------------------------------------------------
	|      default     |   1   |   0   |   0   |   0   ||   0   |   0   |
	---------------------------------------------------------------------
[ivy:retrieve]
[ivy:retrieve] :: problems summary ::
[ivy:retrieve] WARNINGS
[ivy:retrieve]  module not found: hadoop#core;hadoop-0.17.2.1
[ivy:retrieve]  hadoop-resolver: tried
[ivy:retrieve]   -- artifact hadoop#core;hadoop-0.17.2.1!hadoop.tar.gz(source):
[ivy:retrieve]   http://archive.apache.org/dist/hadoop/core/hadoop-hadoop-0.17.2.1/hadoop-hadoop-0.17.2.1.tar.gz
[ivy:retrieve]  ::::::::::::::::::::::::::::::::::::::::::::::
[ivy:retrieve]  ::          UNRESOLVED DEPENDENCIES         ::
[ivy:retrieve]  ::::::::::::::::::::::::::::::::::::::::::::::
[ivy:retrieve]  :: hadoop#core;hadoop-0.17.2.1: not found
[ivy:retrieve]  ::::::::::::::::::::::::::::::::::::::::::::::
[ivy:retrieve]
[ivy:retrieve] :: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS

BUILD FAILED
/home/ssarkar/hive/build.xml:83: The following error occurred while executing this line:
/home/ssarkar/hive/build-common.xml:83: impossible to resolve dependencies: resolve failed - see output for details

Total time: 1 second
[ssar...@ayush2 hive]$
===

--- On Sat, 1/31/09, Prasad Chakka pra...@facebook.com wrote:
From: Prasad Chakka pra...@facebook.com
Subject: Re: Need Help on Eclipse + JDK to load hive project !!
To: hive-dev@hadoop.apache.org, shyam_sar...@yahoo.com
Date: Saturday, January 31, 2009, 11:52 AM

I use Eclipse 3.4.0 (3.4.1 also works) with Java 6. Try the instructions from here: http://wiki.apache.org/hadoop/Hive/GettingStarted/EclipseSetup I use the hadoop-0.17.2.1 version instead of hadoop-0.19, since the former works well with running unit tests.

From: Shyam Sarkar shyam_sar...@yahoo.com
Reply-To: hive-dev@hadoop.apache.org, shyam_sar...@yahoo.com
Date: Sat, 31 Jan 2009 11:46:35 -0800
To: hive-dev@hadoop.apache.org
Subject: Need Help on Eclipse + JDK to load hive project !!

Hi, I have a Linux box running Red Hat Linux 4. I tried many Eclipse and JDK versions, but finally Eclipse 3.2.1 worked with the JDK 1.5 Java runtime. I could not load source files from hive into a project. COULD SOMEONE PLEASE TELL ME THE FOLLOWING:

(1) Is this version of Eclipse + JDK fine for hive development?
(2) How do I load the whole hive project into Eclipse?

Regards, shyam_sar...@yahoo.com
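[Editorial note] The "module not found" in the log above follows from how the resolver splices the version property into the artifact URL: passing -Dhadoop.version=hadoop-0.17.2.1 instead of -Dhadoop.version=0.17.2.1 doubles the "hadoop-" prefix. A sketch reconstructing the URL template from the failed-resolution log (the template itself is an inference from the log, not quoted from the build files):

```java
public class IvyUrlSketch {
    // URL template as it appears in the failed-resolution log above:
    // the "hadoop-" prefix is added around the version property twice.
    static String artifactUrl(String version) {
        return "http://archive.apache.org/dist/hadoop/core/hadoop-" + version
                + "/hadoop-" + version + ".tar.gz";
    }

    public static void main(String[] args) {
        // Version passed with the redundant "hadoop-" prefix, as in the log:
        System.out.println(artifactUrl("hadoop-0.17.2.1")); // ...hadoop-hadoop-0.17.2.1... (not found)
        // Bare version number yields the real archive path:
        System.out.println(artifactUrl("0.17.2.1"));        // ...hadoop-0.17.2.1...
    }
}
```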