RE: Please check grammar for TIMESTAMP
One immediate issue is that the format string is a lexical token, so a string of that format will not conform to the grammar at places where a string literal is expected. A better approach is to treat the format as a StringLiteral and then do the format checks at typecheck and semantic analysis time.

Ashish

-----Original Message-----
From: Shyam Sarkar [mailto:shyam_sar...@yahoo.com]
Sent: Sunday, March 08, 2009 7:16 AM
To: hive-dev@hadoop.apache.org
Subject: Please check grammar for TIMESTAMP

Hi Zheng and others,

Could you please check Hive.g grammar changes for TIMESTAMP (See the comments with // Change by Shyam)? Please review and let me know your feedback. I shall write a short design doc later for review after these short exchanges.

Thanks,
shyam_sar...@yahoo.com
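A minimal sketch of the approach suggested above, where the parser accepts any string literal and the format is only validated later, during semantic analysis. The class and method names (`TimestampFormatCheck`, `unquote`, `checkFormat`) are hypothetical and not part of Hive, and the use of Java's `DateTimeFormatter` pattern letters as the format language is an assumption made purely for illustration; the thread does not settle what the format syntax would be.

```java
import java.time.format.DateTimeFormatter;

public class TimestampFormatCheck {

    /** Strips the surrounding quotes that a string literal token carries. */
    public static String unquote(String literal) {
        if (literal.length() >= 2) {
            char first = literal.charAt(0);
            char last = literal.charAt(literal.length() - 1);
            if ((first == '\'' || first == '"') && first == last) {
                return literal.substring(1, literal.length() - 1);
            }
        }
        return literal;
    }

    /**
     * Semantic-analysis-time check (hypothetical): returns null when the
     * format is acceptable, otherwise an error message for the user.
     * Assumption: formats use java.time pattern letters.
     */
    public static String checkFormat(String literal) {
        String pattern = unquote(literal);
        if (pattern.isEmpty()) {
            return "empty timestamp format";
        }
        try {
            // ofPattern throws IllegalArgumentException for invalid patterns.
            DateTimeFormatter.ofPattern(pattern);
            return null;
        } catch (IllegalArgumentException e) {
            return "invalid timestamp format '" + pattern + "': " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(checkFormat("'yyyy-MM-dd HH:mm:ss'"));
        System.out.println(checkFormat("'yyyy-MM-dd {bad'"));
    }
}
```

With this split, the grammar needs no dedicated lexical token for the format, so an ordinary string that happens to look like a format still lexes as a plain string literal everywhere else.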
RE: Please check grammar for TIMESTAMP
Dear Ashish,

Thanks for the comment. I found the following in MySQL 6.0:

(1) Inside CREATE TABLE, TIMESTAMP does not have any format. It is treated like a primitive type (string).
(2) Inside a SELECT clause, TIMESTAMP(MMDDHHMMSS) is called as a routine, with format information for the output spec.

=== MySQL 6.0 function ===
TIMESTAMP(expr), TIMESTAMP(expr1,expr2)

With a single argument, this function returns the date or datetime expression expr as a datetime value. With two arguments, it adds the time expression expr2 to the date or datetime expression expr1 and returns the result as a datetime value.

mysql> SELECT TIMESTAMP('2003-12-31');
        -> '2003-12-31 00:00:00'
mysql> SELECT TIMESTAMP('2003-12-31 12:00:00','12:00:00');
        -> '2004-01-01 00:00:00'
===

As a result, we have to define TIMESTAMP as a primitive type as well as a complex type with format information. I have to upgrade the grammar after further inspection. I am going to add a basic design document to JIRA. Please provide suggestions.

Thanks,
Shyam

--- On Mon, 3/9/09, Ashish Thusoo <athu...@facebook.com> wrote:

> From: Ashish Thusoo <athu...@facebook.com>
> Subject: RE: Please check grammar for TIMESTAMP
> To: hive-dev@hadoop.apache.org <hive-dev@hadoop.apache.org>
> Date: Monday, March 9, 2009, 2:52 PM
>
> One immediate issue is that the format string is a lexical token, so a string of that format will not conform to the grammar at places where a string literal is expected. A better approach is to treat the format as a StringLiteral and then do the format checks at typecheck and semantic analysis time.
>
> Ashish
>
> -----Original Message-----
> From: Shyam Sarkar [mailto:shyam_sar...@yahoo.com]
> Sent: Sunday, March 08, 2009 7:16 AM
> To: hive-dev@hadoop.apache.org
> Subject: Please check grammar for TIMESTAMP
>
> Hi Zheng and others,
>
> Could you please check Hive.g grammar changes for TIMESTAMP (See the comments with // Change by Shyam)? Please review and let me know your feedback. I shall write a short design doc later for review after these short exchanges.
>
> Thanks,
> shyam_sar...@yahoo.com
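The one- and two-argument behaviour of the MySQL TIMESTAMP() function quoted above can be sketched in Java roughly as follows. This is only an illustration of the semantics, not Hive or MySQL code: the class `MysqlTimestampSketch` and its methods are hypothetical, and the sketch assumes the inputs are either `yyyy-MM-dd` or `yyyy-MM-dd HH:mm:ss` and that the time expression lies within a single day (`HH:mm:ss`).

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

public class MysqlTimestampSketch {

    private static final DateTimeFormatter DATETIME =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** TIMESTAMP(expr): a date or datetime expression becomes a datetime value. */
    public static LocalDateTime timestamp(String expr) {
        // Assumption: a 10-character input is a bare date (yyyy-MM-dd).
        if (expr.length() == 10) {
            return LocalDate.parse(expr).atStartOfDay();
        }
        return LocalDateTime.parse(expr, DATETIME);
    }

    /** TIMESTAMP(expr1, expr2): adds the time expression expr2 to expr1. */
    public static LocalDateTime timestamp(String expr1, String expr2) {
        // Assumption: expr2 is a time of day in HH:mm:ss form.
        LocalTime time = LocalTime.parse(expr2);
        Duration offset = Duration.ofSeconds(time.toSecondOfDay());
        return timestamp(expr1).plus(offset);
    }

    /** Renders a datetime value in the 'yyyy-MM-dd HH:mm:ss' form shown above. */
    public static String format(LocalDateTime value) {
        return value.format(DATETIME);
    }

    public static void main(String[] args) {
        System.out.println(format(timestamp("2003-12-31")));
        // prints 2003-12-31 00:00:00
        System.out.println(format(timestamp("2003-12-31 12:00:00", "12:00:00")));
        // prints 2004-01-01 00:00:00
    }
}
```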
Please check grammar for TIMESTAMP
Hi Zheng and others,

Could you please check Hive.g grammar changes for TIMESTAMP (See the comments with // Change by Shyam)? Please review and let me know your feedback. I shall write a short design doc later for review after these short exchanges.

Thanks,
shyam_sar...@yahoo.com

grammar Hive;

options
{
output=AST;
ASTLabelType=CommonTree;
backtrack=true;
k=1;
}

tokens {
TOK_INSERT; TOK_QUERY; TOK_SELECT; TOK_SELECTDI; TOK_SELEXPR;
TOK_FROM; TOK_TAB; TOK_PARTSPEC; TOK_PARTVAL; TOK_DIR; TOK_LOCAL_DIR;
TOK_TABREF; TOK_SUBQUERY; TOK_DESTINATION; TOK_ALLCOLREF; TOK_COLREF;
TOK_FUNCTION; TOK_FUNCTIONDI; TOK_WHERE;
TOK_OP_EQ; TOK_OP_NE; TOK_OP_LE; TOK_OP_LT; TOK_OP_GE; TOK_OP_GT;
TOK_OP_DIV; TOK_OP_ADD; TOK_OP_SUB; TOK_OP_MUL; TOK_OP_MOD;
TOK_OP_BITAND; TOK_OP_BITNOT; TOK_OP_BITOR; TOK_OP_BITXOR;
TOK_OP_AND; TOK_OP_OR; TOK_OP_NOT; TOK_OP_LIKE;
TOK_TRUE; TOK_FALSE; TOK_TRANSFORM; TOK_EXPLIST; TOK_ALIASLIST;
TOK_GROUPBY; TOK_ORDERBY; TOK_CLUSTERBY; TOK_DISTRIBUTEBY; TOK_SORTBY;
TOK_UNION; TOK_JOIN; TOK_LEFTOUTERJOIN; TOK_RIGHTOUTERJOIN; TOK_FULLOUTERJOIN;
TOK_LOAD; TOK_NULL; TOK_ISNULL; TOK_ISNOTNULL;
TOK_TINYINT; TOK_SMALLINT; TOK_INT; TOK_BIGINT; TOK_BOOLEAN;
TOK_FLOAT; TOK_DOUBLE; TOK_DATE; TOK_DATETIME; TOK_TIMESTAMP;
TOK_STRING; TOK_LIST; TOK_MAP;
TOK_CREATETABLE; TOK_DESCTABLE;
TOK_ALTERTABLE_RENAME; TOK_ALTERTABLE_ADDCOLS; TOK_ALTERTABLE_REPLACECOLS;
TOK_ALTERTABLE_ADDPARTS; TOK_ALTERTABLE_DROPPARTS;
TOK_ALTERTABLE_SERDEPROPERTIES; TOK_ALTERTABLE_SERIALIZER; TOK_ALTERTABLE_PROPERTIES;
TOK_MSCK; TOK_SHOWTABLES; TOK_SHOWPARTITIONS;
TOK_CREATEEXTTABLE; TOK_DROPTABLE;
TOK_TABCOLLIST; TOK_TABCOL; TOK_TABLECOMMENT; TOK_TABLEPARTCOLS; TOK_TABLEBUCKETS;
TOK_TABLEROWFORMAT; TOK_TABLEROWFORMATFIELD; TOK_TABLEROWFORMATCOLLITEMS;
TOK_TABLEROWFORMATMAPKEYS; TOK_TABLEROWFORMATLINES;
TOK_TBLSEQUENCEFILE; TOK_TBLTEXTFILE; TOK_TABLEFILEFORMAT;
TOK_TABCOLNAME; TOK_TABLELOCATION; TOK_PARTITIONLOCATION;
TOK_TABLESAMPLE; TOK_TMP_FILE;
TOK_TABSORTCOLNAMEASC; TOK_TABSORTCOLNAMEDESC;
TOK_CHARSETLITERAL; TOK_CREATEFUNCTION; TOK_EXPLAIN;
TOK_TABLESERIALIZER; TOK_TABLEPROPERTIES; TOK_TABLEPROPLIST;
TOK_TABTYPE; TOK_LIMIT; TOK_TABLEPROPERTY; TOK_IFNOTEXISTS;
}

// Package headers
@header {
package org.apache.hadoop.hive.ql.parse;
}
@lexer::header {package org.apache.hadoop.hive.ql.parse;}

@members {
  Stack msgs = new Stack<String>();
}

@rulecatch {
catch (RecognitionException e) {
  reportError(e);
  throw e;
}
}

// starting rule
statement
    : explainStatement EOF
    | execStatement EOF
    ;

explainStatement
@init { msgs.push("explain statement"); }
@after { msgs.pop(); }
    : KW_EXPLAIN (isExtended=KW_EXTENDED)? execStatement -> ^(TOK_EXPLAIN execStatement $isExtended?)
    ;

execStatement
@init { msgs.push("statement"); }
@after { msgs.pop(); }
    : queryStatementExpression
    | loadStatement
    | ddlStatement
    ;

loadStatement
@init { msgs.push("load statement"); }
@after { msgs.pop(); }
    : KW_LOAD KW_DATA (islocal=KW_LOCAL)? KW_INPATH (path=StringLiteral) (isoverwrite=KW_OVERWRITE)? KW_INTO KW_TABLE (tab=tabName)
    -> ^(TOK_LOAD $path $tab $islocal? $isoverwrite?)
    ;

ddlStatement
@init { msgs.push("ddl statement"); }
@after { msgs.pop(); }
    : createStatement
    | dropStatement
    | alterStatement
    | descStatement
    | showStatement
    | metastoreCheck
    | createFunctionStatement
    ;

ifNotExists
@init { msgs.push("if not exists clause"); }
@after { msgs.pop(); }
    : KW_IF KW_NOT KW_EXISTS
    -> ^(TOK_IFNOTEXISTS)
    ;

createStatement
@init { msgs.push("create statement"); }
@after { msgs.pop(); }
    : KW_CREATE (ext=KW_EXTERNAL)? KW_TABLE ifNotExists? name=Identifier (LPAREN columnNameTypeList RPAREN)? tableComment? tablePartition? tableBuckets? tableRowFormat? tableFileFormat? tableLocation?
    -> {$ext == null}? ^(TOK_CREATETABLE $name ifNotExists? columnNameTypeList? tableComment? tablePartition? tableBuckets? tableRowFormat? tableFileFormat? tableLocation?)
    -> ^(TOK_CREATEEXTTABLE $name ifNotExists? columnNameTypeList? tableComment? tablePartition? tableBuckets? tableRowFormat? tableFileFormat? tableLocation?)
    ;

dropStatement
@init { msgs.push("drop statement"); }
@after { msgs.pop(); }
    : KW_DROP KW_TABLE Identifier -> ^(TOK_DROPTABLE Identifier)
    ;

alterStatement
@init { msgs.push("alter statement"); }
@after { msgs.pop(); }
    : alterStatementRename
    | alterStatementAddCol
    | alterStatementDropPartitions
    | alterStatementAddPartitions
    | alterStatementProperties
    | alterStatementSerdeProperties
    ;

alterStatementRename
@init { msgs.push("rename statement"); }
@after { msgs.pop(); }
    : KW_ALTER KW_TABLE oldName=Identifier KW_RENAME KW_TO newName=Identifier
    -> ^(TOK_ALTERTABLE_RENAME $oldName $newName)
    ;

alterStatementAddCol
@init { msgs.push("add column statement"); }
@after { msgs.pop(); }
    : KW_ALTER KW_TABLE Identifier (add=KW_ADD | replace=KW_REPLACE) KW_COLUMNS LPAREN
Re: Please check grammar for TIMESTAMP
Is there going to be any time zone support? That is, will the timestamp be stored in a recognised standard such as UTC, regardless of the actual time submitted? Given that Hive/Hadoop tend to be used for log processing and reporting in many use cases, understanding how time zones are normalised may be necessary, especially where data is sourced from multiple time zones.

It may be worth considering this issue now, as retrofitting it later may cause problems.

On 8 Mar 2009, at 14:15, Shyam Sarkar wrote:

> Hi Zheng and others,
>
> Could you please check Hive.g grammar changes for TIMESTAMP (See the comments with // Change by Shyam)? Please review and let me know your feedback. I shall write a short design doc later for review after these short exchanges.
>
> Thanks,
> shyam_sar...@yahoo.com
Re: Please check grammar for TIMESTAMP
Yes, there will be time zone support. We shall follow the MySQL 6.0 TIMESTAMP specification:

http://dev.mysql.com/doc/refman/6.0/en/timestamp.html

Thanks,
shyam_sar...@yahoo.com

--- On Sun, 3/8/09, Tim Hawkins <tim.hawk...@bejant.com> wrote:

> From: Tim Hawkins <tim.hawk...@bejant.com>
> Subject: Re: Please check grammar for TIMESTAMP
> To: hive-dev@hadoop.apache.org
> Date: Sunday, March 8, 2009, 7:22 AM
>
> Is there going to be any time zone support? That is, will the timestamp be stored in a recognised standard such as UTC, regardless of the actual time submitted? Given that Hive/Hadoop tend to be used for log processing and reporting in many use cases, understanding how time zones are normalised may be necessary, especially where data is sourced from multiple time zones.
>
> It may be worth considering this issue now, as retrofitting it later may cause problems.
>
> On 8 Mar 2009, at 14:15, Shyam Sarkar wrote:
>
> > Hi Zheng and others,
> >
> > Could you please check Hive.g grammar changes for TIMESTAMP (See the comments with // Change by Shyam)? Please review and let me know your feedback. I shall write a short design doc later for review after these short exchanges.
> >
> > Thanks,
> > shyam_sar...@yahoo.com
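To make the time zone question concrete: the MySQL TIMESTAMP type converts values from the session time zone to UTC for storage and back again on retrieval. Below is a minimal Java sketch of that normalisation, assuming log lines carry a local time plus a known zone. The class `UtcNormalizeSketch` and its methods are hypothetical, and nothing here reflects how Hive would actually implement it.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

public class UtcNormalizeSketch {

    private static final DateTimeFormatter DATETIME =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** On write: interpret the local text in its source zone and store a UTC instant. */
    public static Instant toUtc(String localText, String zoneId) {
        return LocalDateTime.parse(localText, DATETIME)
                .atZone(ZoneId.of(zoneId))
                .toInstant();
    }

    /** On read: render the stored UTC instant in the reader's zone. */
    public static String fromUtc(Instant stored, String zoneId) {
        return LocalDateTime.ofInstant(stored, ZoneId.of(zoneId)).format(DATETIME);
    }

    public static void main(String[] args) {
        // Two log lines from different zones that describe the same moment.
        Instant west = toUtc("2009-01-15 09:00:00", "America/Los_Angeles");
        Instant east = toUtc("2009-01-16 02:00:00", "Asia/Tokyo");
        System.out.println(west.equals(east));
        System.out.println(fromUtc(west, "UTC"));
    }
}
```

Storing the normalised instant means rows sourced from different time zones compare and sort consistently, which is the log-processing concern raised above.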