[jira] Commented: (HIVE-1852) Reduce unnecessary DFSClient.rename() calls

2010-12-21 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973519#action_12973519
 ] 

He Yongqiang commented on HIVE-1852:


@Joy, the FsShell delete is there because FB's Hadoop uses a slightly 
different implementation of FileSystem.delete. 
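
For illustration, a minimal sketch of the two deletion paths being contrasted here - a 
direct FileSystem.delete() versus going through FsShell - using a plain Hadoop 
Configuration; the path is made up and this is not the Hive code itself:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FsShell;
    import org.apache.hadoop.fs.Path;

    public class DeleteSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path dir = new Path("/tmp/hive-staging/example");  // hypothetical path

        // Option 1: direct API call; semantics depend on the FileSystem implementation.
        FileSystem fs = dir.getFileSystem(conf);
        fs.delete(dir, true /* recursive */);

        // Option 2: go through FsShell, which layers its own behavior on top of
        // FileSystem.delete -- the reason cited above for keeping it.
        FsShell shell = new FsShell();
        shell.setConf(conf);
        shell.run(new String[] { "-rmr", dir.toString() });
      }
    }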

> Reduce unnecessary DFSClient.rename() calls
> ---
>
> Key: HIVE-1852
> URL: https://issues.apache.org/jira/browse/HIVE-1852
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ning Zhang
>Assignee: Ning Zhang
> Attachments: HIVE-1852.2.patch, HIVE-1852.3.patch, HIVE-1852.patch
>
>
> On the Hive client side (MoveTask etc.), DFSClient.rename() is called for every 
> file inside a directory. This is very expensive for a large directory on a busy 
> DFS namenode. We should replace it with a single rename() call on the whole 
> directory. 
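
For illustration, a sketch of the contrast the description is making - one namenode 
rename per file versus a single rename of the whole directory (paths are made up; 
this is not the patch):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class RenameSketch {
      // One namenode RPC per file: expensive for a large directory.
      static void renamePerFile(FileSystem fs, Path srcDir, Path destDir) throws IOException {
        for (FileStatus stat : fs.listStatus(srcDir)) {
          fs.rename(stat.getPath(), new Path(destDir, stat.getPath().getName()));
        }
      }

      // One RPC for the whole directory.
      static void renameWholeDir(FileSystem fs, Path srcDir, Path destDir) throws IOException {
        fs.rename(srcDir, destDir);
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path src = new Path("/tmp/hive-staging/out");    // hypothetical staging dir
        Path dest = new Path("/user/hive/warehouse/t");  // hypothetical table dir
        FileSystem fs = src.getFileSystem(conf);
        renameWholeDir(fs, src, dest);
      }
    }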

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1852) Reduce unnecessary DFSClient.rename() calls

2010-12-21 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1852:
-

Attachment: HIVE-1852.4.patch

@joydeep, you are right. If there is no 'local' keyword there is no CopyTask, 
and MoveTask takes wildcards. I'm uploading a new patch that addresses 
wildcards. Also reverted to FsShell using oldPath for the reason Yongqiang 
mentioned. 

I'm running unit tests.

> Reduce unnecessary DFSClient.rename() calls
> ---
>
> Key: HIVE-1852
> URL: https://issues.apache.org/jira/browse/HIVE-1852
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ning Zhang
>Assignee: Ning Zhang
> Attachments: HIVE-1852.2.patch, HIVE-1852.3.patch, HIVE-1852.4.patch, 
> HIVE-1852.patch
>
>
> On the Hive client side (MoveTask etc.), DFSClient.rename() is called for every 
> file inside a directory. This is very expensive for a large directory on a busy 
> DFS namenode. We should replace it with a single rename() call on the whole 
> directory. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1847) option of continue on error

2010-12-21 Thread Thiruvel Thirumoolan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973652#action_12973652
 ] 

Thiruvel Thirumoolan commented on HIVE-1847:


hive.cli.errors.ignore?

It's set to 'false' by default; after setting it to 'true', all commands in 
the script are executed, no matter how many fail.
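
A minimal sketch of what the flag does, assuming a generic script-execution loop; only 
the property name comes from the comment above, everything else is illustrative:

    import org.apache.hadoop.conf.Configuration;

    public class IgnoreErrorsSketch {
      // Illustrative loop: with hive.cli.errors.ignore=true, a failed command is
      // logged and the script keeps going instead of aborting at the first error.
      static void runScript(Configuration conf, String[] commands) {
        boolean ignoreErrors = conf.getBoolean("hive.cli.errors.ignore", false);
        for (String cmd : commands) {
          try {
            execute(cmd);                    // hypothetical executor
          } catch (Exception e) {
            System.err.println("FAILED: " + cmd);
            if (!ignoreErrors) {
              throw new RuntimeException(e); // default: stop at the first failure
            }
          }
        }
      }

      static void execute(String cmd) { /* placeholder */ }
    }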

> option of continue on error
> ---
>
> Key: HIVE-1847
> URL: https://issues.apache.org/jira/browse/HIVE-1847
> Project: Hive
>  Issue Type: Improvement
>Reporter: Namit Jain
>
> In "hive -f 

facebook friends relationship data

2010-12-21 Thread xintao

Dear all,

I read this webpage 
(http://www.facebook.com/notes/facebook-engineering/visualizing-friendships/469716398919). 
Does it mean we can get the Facebook friendship data from here?


Any reply would be highly appreciated.

Best regards,

Xintao 



[jira] Updated: (HIVE-1852) Reduce unnecessary DFSClient.rename() calls

2010-12-21 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1852:
-

Attachment: HIVE-1852.5.patch

Uploading HIVE-1852.5.patch containing a simple log fix in load_overwrite.q.

> Reduce unnecessary DFSClient.rename() calls
> ---
>
> Key: HIVE-1852
> URL: https://issues.apache.org/jira/browse/HIVE-1852
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ning Zhang
>Assignee: Ning Zhang
> Attachments: HIVE-1852.2.patch, HIVE-1852.3.patch, HIVE-1852.4.patch, 
> HIVE-1852.5.patch, HIVE-1852.patch
>
>
> On the Hive client side (MoveTask etc.), DFSClient.rename() is called for every 
> file inside a directory. This is very expensive for a large directory on a busy 
> DFS namenode. We should replace it with a single rename() call on the whole 
> directory. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1696) Add delegation token support to metastore

2010-12-21 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HIVE-1696:
--

Attachment: hive-1696-1.patch

Patch with a little bit of refactoring over the previous one. I also removed 
the code that checks whether a client is Kerberos-authenticated in the 
getDelegationToken and renewDelegationToken methods. A cleaner implementation 
of that check requires some changes to a Hadoop security API; we can do that 
in a follow-up JIRA.

> Add delegation token support to metastore
> -
>
> Key: HIVE-1696
> URL: https://issues.apache.org/jira/browse/HIVE-1696
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Security, Server Infrastructure
>Reporter: Todd Lipcon
> Fix For: 0.7.0
>
> Attachments: hive-1696-1-with-gen-code.patch, hive-1696-1.patch, 
> hive_1696.patch, hive_1696.patch, hive_1696_no-thrift.patch
>
>
> As discussed in HIVE-842, kerberos authentication is only sufficient for 
> authentication of a hive user client to the metastore. There are other cases 
> where thrift calls need to be authenticated when the caller is running in an 
> environment without kerberos credentials. For example, an MR task running as 
> part of a hive job may want to report statistics to the metastore, or a job 
> may be running within the context of Oozie or Hive Server.
> This JIRA is to implement support of delegation tokens for the metastore. The 
> concept of a delegation token is borrowed from the Hadoop security design - 
> the quick summary is that a kerberos-authenticated client may retrieve a 
> binary token from the server. This token can then be passed to other clients 
> which can use it to achieve authentication as the original user in lieu of a 
> kerberos ticket.
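
A hedged sketch of the flow described above - a Kerberos-authenticated client fetches a 
token and a credential-less process uses it; the metastore client method name and the 
renewer are assumptions for illustration, not the exact API added by the patch:

    import org.apache.hadoop.security.UserGroupInformation;
    import org.apache.hadoop.security.token.Token;
    import org.apache.hadoop.security.token.TokenIdentifier;

    public class DelegationTokenSketch {
      // Hypothetical stand-in for the metastore client API; the real method
      // name/signature may differ.
      interface MetaStoreClient {
        String getDelegationToken(String renewer) throws Exception;
      }

      static void shipTokenToTask(MetaStoreClient client) throws Exception {
        // 1. A Kerberos-authenticated client asks the metastore for a token.
        String tokenStr = client.getDelegationToken("oozie");  // renewer is illustrative

        // 2. The serialized token is handed to a process without Kerberos
        //    credentials (e.g. an MR task), which attaches it to its UGI and can
        //    then authenticate to the metastore as the original user.
        Token<TokenIdentifier> token = new Token<TokenIdentifier>();
        token.decodeFromUrlString(tokenStr);
        UserGroupInformation.getCurrentUser().addToken(token);
      }
    }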

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1696) Add delegation token support to metastore

2010-12-21 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HIVE-1696:
--

Attachment: hive-1696-1-with-gen-code.patch

This is with the generated code.

> Add delegation token support to metastore
> -
>
> Key: HIVE-1696
> URL: https://issues.apache.org/jira/browse/HIVE-1696
> Project: Hive
>  Issue Type: Sub-task
>  Components: Metastore, Security, Server Infrastructure
>Reporter: Todd Lipcon
> Fix For: 0.7.0
>
> Attachments: hive-1696-1-with-gen-code.patch, hive-1696-1.patch, 
> hive_1696.patch, hive_1696.patch, hive_1696_no-thrift.patch
>
>
> As discussed in HIVE-842, kerberos authentication is only sufficient for 
> authentication of a hive user client to the metastore. There are other cases 
> where thrift calls need to be authenticated when the caller is running in an 
> environment without kerberos credentials. For example, an MR task running as 
> part of a hive job may want to report statistics to the metastore, or a job 
> may be running within the context of Oozie or Hive Server.
> This JIRA is to implement support of delegation tokens for the metastore. The 
> concept of a delegation token is borrowed from the Hadoop security design - 
> the quick summary is that a kerberos-authenticated client may retrieve a 
> binary token from the server. This token can then be passed to other clients 
> which can use it to achieve authentication as the original user in lieu of a 
> kerberos ticket.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1856) Implement DROP TABLE/VIEW ... IF EXISTS

2010-12-21 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973809#action_12973809
 ] 

John Sichi commented on HIVE-1856:
--

+1.  Will commit when tests pass.

Note for next time:  for patch updates, our convention is to number them like 
HIVE-1856.1.patch, HIVE-1856.2.patch, etc.  And then click the "Submit Patch" 
button again when a new one is uploaded; this makes sure it gets back into the 
review queue.


> Implement DROP TABLE/VIEW ... IF EXISTS 
> 
>
> Key: HIVE-1856
> URL: https://issues.apache.org/jira/browse/HIVE-1856
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 0.7.0
>Reporter: Marcel Kornacker
>Assignee: Marcel Kornacker
> Attachments: hive-1856.patch, hive-1856.patch
>
>
> This issue combines issues HIVE-1550/1165/1542/1551:
> - augment DROP TABLE/VIEW with IF EXISTS
> - signal an error if the table/view doesn't exist and IF EXISTS wasn't 
> specified
> - introduce a flag in the configuration that allows you to turn off the new 
> behavior

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Build failed in Hudson: Hive-trunk-h0.20 #449

2010-12-21 Thread Apache Hudson Server
See 

Changes:

[pauly] HIVE-1857 mixed case tablename on lefthand side of LATERAL VIEW results 
in
query failing with confusing error message (John Sichi via pauly)

[namit] HIVE-1855 Include Process ID in the log4j log file name
(Ning Zhang via namit)

[nzhang] HIVE-1835. Better auto-complete for Hive (Paul Butler via Ning Zhang)

[jssarma] commit second diff for hive-1846 (rvadali via jssarma)

[jssarma] Reversing erroneous commit

[jssarma] commit second diff for hive-1846 (rvadali via jssarma)

[namit] HIVE-1854 Temporarily disable metastore tests for 
listPartitionsByFilter()
(Paul Yang via namit)

--
[...truncated 15104 lines...]
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Copying data from 

[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Copying data from 

[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Copying data from 

[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 

 

[junit] Done query: unknown_table1.q
[junit] Begin query: unknown_table2.q
[junit] Copying data from 

[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=11)
[junit] rmr: cannot remove 
p:
 No such file or directory.
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Copying data from 

[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=12)
[junit] rmr: cannot remove 
p:
 No such file or directory.
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Copying data from 

[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=11)
[junit] rmr: cannot remove 
p:
 No such file or directory.
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Copying data from 

[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=12)
[junit] rmr: cannot remove 
p:
 No such file or directory.
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Copying data from 

[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Copying data from 

[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 

[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 

[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 

[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 


[jira] Commented: (HIVE-1853) downgrade JDO version

2010-12-21 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1853?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973847#action_12973847
 ] 

Namit Jain commented on HIVE-1853:
--

Unfortunately, the query that I was running used some production tables.
I will try to reproduce the query with some non-production tables.

> downgrade JDO version
> -
>
> Key: HIVE-1853
> URL: https://issues.apache.org/jira/browse/HIVE-1853
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.7.0
>Reporter: Namit Jain
>Assignee: Paul Yang
> Fix For: 0.7.0
>
> Attachments: HIVE-1853.1.patch, HIVE-1853.2.patch
>
>
> After HIVE-1609, we are seeing some 'table not found' errors intermittently.
> We have a test case where 5 processes concurrently issue the same query - 
> explain extended insert .. select from 
> and once in a while, we get a 'not found' error - 
> When we revert the JDO version, the error is gone.
> We can investigate later to find the JDO bug, but for now this is a 
> show-stopper for Facebook and needs to be reverted immediately.
> This also means that the filters will not be pushed to MySQL.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1852) Reduce unnecessary DFSClient.rename() calls

2010-12-21 Thread Joydeep Sen Sarma (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973883#action_12973883
 ] 

Joydeep Sen Sarma commented on HIVE-1852:
-

Hive.java:1564 - this should read fs.rename(srcs[0]) (since srcf may have been 
a wildcard that matched a single dir).
Hive.java:1574 - we can optimize this loop, I think. If the wildcard does not 
match a single directory, then it has to match a set of files; the load 
semantic analyzer already enforces this. So we don't need a second listStatus 
and loop over the entries here - we can directly move each of the srcs into 
destf/srcs.getName.

We have lost the atomic move for the wildcard case. I think that's OK (it's 
not used much, I would imagine) - at least leave a note/TODO saying that it 
would be nice to have this atomic.

The new tests look pretty good to me - the load/move case with wildcards is 
getting covered. We could add one where the load path is a wildcard that 
matches a single dir to cover the first comment here.
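
Roughly, the control flow being suggested in the two comments above (a sketch using the 
names from the comment, not the actual patch):

    import java.io.IOException;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class MoveSketch {
      // srcs is the expansion of the (possibly wildcard) load path, e.g. fs.globStatus(srcf).
      static void move(FileSystem fs, FileStatus[] srcs, Path destf) throws IOException {
        if (srcs.length == 1 && srcs[0].isDir()) {
          // The wildcard matched a single directory: one rename of that directory.
          fs.rename(srcs[0].getPath(), destf);
        } else {
          // Otherwise the matches are files (the load semantic analyzer enforces this):
          // move each one directly under destf; no second listStatus needed.
          for (FileStatus src : srcs) {
            fs.rename(src.getPath(), new Path(destf, src.getPath().getName()));
          }
        }
      }
    }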


> Reduce unnecessary DFSClient.rename() calls
> ---
>
> Key: HIVE-1852
> URL: https://issues.apache.org/jira/browse/HIVE-1852
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ning Zhang
>Assignee: Ning Zhang
> Attachments: HIVE-1852.2.patch, HIVE-1852.3.patch, HIVE-1852.4.patch, 
> HIVE-1852.5.patch, HIVE-1852.patch
>
>
> On the Hive client side (MoveTask etc.), DFSClient.rename() is called for every 
> file inside a directory. This is very expensive for a large directory on a busy 
> DFS namenode. We should replace it with a single rename() call on the whole 
> directory. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1806) The merge criteria on dynamic partitons should be per partiton

2010-12-21 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1806:
-

Attachment: HIVE-1806.2.patch

Uploading a new patch that resolved the diff in bucketmapjoin2.q. I'm rerunning 
unit tests. 

> The merge criteria on dynamic partitons should be per partiton
> --
>
> Key: HIVE-1806
> URL: https://issues.apache.org/jira/browse/HIVE-1806
> Project: Hive
>  Issue Type: Bug
>Reporter: Ning Zhang
>Assignee: Ning Zhang
> Attachments: HIVE-1806.2.patch, HIVE-1806.patch
>
>
> Currently the criterion for whether a merge job should be fired on dynamically 
> generated partitions is the average file size across all dynamic partitions. It 
> is very common that some dynamic partitions contain mostly large files and some 
> contain mostly small files. Even if the average size across all files is larger 
> than hive.merge.smallfiles.avgsize, we should still merge those partitions that 
> contain only small files. 
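
A sketch of the per-partition decision the description argues for; the threshold 
corresponds to hive.merge.smallfiles.avgsize, everything else is illustrative:

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class MergeDecisionSketch {
      // Decide per dynamic partition instead of on the global average across partitions.
      static List<Path> partitionsToMerge(FileSystem fs, List<Path> partitionDirs,
                                          long avgSizeThreshold) throws IOException {
        List<Path> toMerge = new ArrayList<Path>();
        for (Path dir : partitionDirs) {
          FileStatus[] files = fs.listStatus(dir);
          if (files == null || files.length == 0) {
            continue;
          }
          long total = 0;
          for (FileStatus f : files) {
            total += f.getLen();
          }
          long avg = total / files.length;
          if (avg < avgSizeThreshold) {  // threshold ~ hive.merge.smallfiles.avgsize
            toMerge.add(dir);            // only this partition gets a merge job
          }
        }
        return toMerge;
      }
    }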

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1806) The merge criteria on dynamic partitons should be per partiton

2010-12-21 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1806:
-

Status: Patch Available  (was: Open)

> The merge criteria on dynamic partitons should be per partiton
> --
>
> Key: HIVE-1806
> URL: https://issues.apache.org/jira/browse/HIVE-1806
> Project: Hive
>  Issue Type: Bug
>Reporter: Ning Zhang
>Assignee: Ning Zhang
> Attachments: HIVE-1806.2.patch, HIVE-1806.patch
>
>
> Currently the criterion for whether a merge job should be fired on dynamically 
> generated partitions is the average file size across all dynamic partitions. It 
> is very common that some dynamic partitions contain mostly large files and some 
> contain mostly small files. Even if the average size across all files is larger 
> than hive.merge.smallfiles.avgsize, we should still merge those partitions that 
> contain only small files. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-1859) Hive's tinyint datatype is not supported by the Hive JDBC driver

2010-12-21 Thread Guy le Mar (JIRA)
Hive's tinyint datatype is not supported by the Hive JDBC driver


 Key: HIVE-1859
 URL: https://issues.apache.org/jira/browse/HIVE-1859
 Project: Hive
  Issue Type: Bug
  Components: Drivers
Affects Versions: 0.5.0
 Environment: Create a Hive table containing a tinyint column.
Then, using the Hive JDBC driver, execute a Hive query that selects data from 
this table. An error is then encountered.
Reporter: Guy le Mar


java.sql.SQLException: Could not create ResultSet: 
org.apache.hadoop.hive.serde2.dynamic_type.ParseException: Encountered "byte" 
at line 1, column 47.
Was expecting one of:
"bool" ...
"i16" ...
"i32" ...
"i64" ...
"double" ...
"string" ...
"map" ...
"list" ...
"set" ...
"required" ...
"optional" ...
"skip" ...
 ...
 ...
"}" ...

at 
org.apache.hadoop.hive.jdbc.HiveResultSet.initDynamicSerde(HiveResultSet.java:120)
at 
org.apache.hadoop.hive.jdbc.HiveResultSet.<init>(HiveResultSet.java:74)
at 
org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:178)
at com.quest.orahive.HiveJdbcClient.main(HiveJdbcClient.java:117)
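
A minimal reproduction along the lines of the Environment description, using the old 
Hive JDBC driver class and URL form; host, port, and table name are assumptions:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class TinyintRepro {
      public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
        Connection con =
            DriverManager.getConnection("jdbc:hive://localhost:10000/default", "", "");
        Statement stmt = con.createStatement();
        // t has a tinyint column; the query fails while the result set's
        // DynamicSerDe is initialized, as in the trace above.
        ResultSet rs = stmt.executeQuery("SELECT * FROM t");
        while (rs.next()) {
          System.out.println(rs.getString(1));
        }
        con.close();
      }
    }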

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-1860) Hive's smallint datatype is not supported by the Hive JDBC driver

2010-12-21 Thread Guy le Mar (JIRA)
Hive's smallint datatype is not supported by the Hive JDBC driver
-

 Key: HIVE-1860
 URL: https://issues.apache.org/jira/browse/HIVE-1860
 Project: Hive
  Issue Type: Bug
  Components: Drivers
Affects Versions: 0.5.0
 Environment: Create a Hive table containing a smallint column.
Then, using the Hive JDBC driver, execute a Hive query that selects data from 
this table. An error is then encountered.
Reporter: Guy le Mar


java.sql.SQLException: Inrecognized column type: i16
at 
org.apache.hadoop.hive.jdbc.HiveResultSetMetaData.getColumnType(HiveResultSetMetaData.java:132)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-1861) Hive's float datatype is not supported by the Hive JDBC driver

2010-12-21 Thread Guy le Mar (JIRA)
Hive's float datatype is not supported by the Hive JDBC driver
--

 Key: HIVE-1861
 URL: https://issues.apache.org/jira/browse/HIVE-1861
 Project: Hive
  Issue Type: Bug
  Components: Drivers
Affects Versions: 0.5.0
 Environment: Create a Hive table containing a float column.
Then, using the Hive JDBC driver, execute a Hive query that selects data from 
this table. An error is then encountered.

Reporter: Guy le Mar


ERROR: DDL specifying type float which has not been defined
java.lang.RuntimeException: specifying type float which has not been defined
at 
org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.FieldType(thrift_grammar.java:1879)
at 
org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Field(thrift_grammar.java:1545)
at 
org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.FieldList(thrift_grammar.java:1501)
at 
org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Struct(thrift_grammar.java:1171)
at 
org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.TypeDefinition(thrift_grammar.java:497)
at 
org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Definition(thrift_grammar.java:439)
at 
org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Start(thrift_grammar.java:101)
at 
org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe.initialize(DynamicSerDe.java:102)
at 
org.apache.hadoop.hive.jdbc.HiveResultSet.initDynamicSerde(HiveResultSet.java:117)
at 
org.apache.hadoop.hive.jdbc.HiveResultSet.<init>(HiveResultSet.java:74)
at 
org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:178)
at com.quest.orahive.HiveJdbcClient.main(HiveJdbcClient.java:117)

org.apache.hadoop.hive.serde2.SerDeException: java.lang.RuntimeException: 
specifying type float which has not been defined
at 
org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe.initialize(DynamicSerDe.java:117)
at 
org.apache.hadoop.hive.jdbc.HiveResultSet.initDynamicSerde(HiveResultSet.java:117)
at 
org.apache.hadoop.hive.jdbc.HiveResultSet.<init>(HiveResultSet.java:74)
at 
org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:178)
at com.quest.orahive.HiveJdbcClient.main(HiveJdbcClient.java:117)
Caused by: java.lang.RuntimeException: specifying type float which has not been 
defined
at 
org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.FieldType(thrift_grammar.java:1879)
at 
org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Field(thrift_grammar.java:1545)
at 
org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.FieldList(thrift_grammar.java:1501)
at 
org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Struct(thrift_grammar.java:1171)
at 
org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.TypeDefinition(thrift_grammar.java:497)
at 
org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Definition(thrift_grammar.java:439)
at 
org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Start(thrift_grammar.java:101)
at 
org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe.initialize(DynamicSerDe.java:102)
... 4 more

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HIVE-1856) Implement DROP TABLE/VIEW ... IF EXISTS

2010-12-21 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi resolved HIVE-1856.
--

   Resolution: Fixed
Fix Version/s: 0.7.0
 Release Note: Backwards-compatible DROP behavior is controlled via new 
configuration parameter hive.exec.drop.ignorenonexistent
 Hadoop Flags: [Reviewed]

Committed.  Thanks Marcel!


> Implement DROP TABLE/VIEW ... IF EXISTS 
> 
>
> Key: HIVE-1856
> URL: https://issues.apache.org/jira/browse/HIVE-1856
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 0.7.0
>Reporter: Marcel Kornacker
>Assignee: Marcel Kornacker
> Fix For: 0.7.0
>
> Attachments: hive-1856.patch, hive-1856.patch
>
>
> This issue combines issues HIVE-1550/1165/1542/1551:
> - augment DROP TABLE/VIEW with IF EXISTS
> - signal an error if the table/view doesn't exist and IF EXISTS wasn't 
> specified
> - introduce a flag in the configuration that allows you to turn off the new 
> behavior

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1852) Reduce unnecessary DFSClient.rename() calls

2010-12-21 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1852:
-

Attachment: HIVE-1852.6.patch

Addressed Joy's comments. Also added one more test case where a wildcard 
matches a single directory name (load_fs.q).

> Reduce unnecessary DFSClient.rename() calls
> ---
>
> Key: HIVE-1852
> URL: https://issues.apache.org/jira/browse/HIVE-1852
> Project: Hive
>  Issue Type: Improvement
>Reporter: Ning Zhang
>Assignee: Ning Zhang
> Attachments: HIVE-1852.2.patch, HIVE-1852.3.patch, HIVE-1852.4.patch, 
> HIVE-1852.5.patch, HIVE-1852.6.patch, HIVE-1852.patch
>
>
> On the Hive client side (MoveTask etc.), DFSClient.rename() is called for every 
> file inside a directory. This is very expensive for a large directory on a busy 
> DFS namenode. We should replace it with a single rename() call on the whole 
> directory. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1856) Implement DROP TABLE/VIEW ... IF EXISTS

2010-12-21 Thread Jeff Hammerbacher (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973998#action_12973998
 ] 

Jeff Hammerbacher commented on HIVE-1856:
-

John: added your comments about patch updates to 
http://wiki.apache.org/hadoop/Hive/HowToContribute#Updating_a_patch

> Implement DROP TABLE/VIEW ... IF EXISTS 
> 
>
> Key: HIVE-1856
> URL: https://issues.apache.org/jira/browse/HIVE-1856
> Project: Hive
>  Issue Type: New Feature
>Affects Versions: 0.7.0
>Reporter: Marcel Kornacker
>Assignee: Marcel Kornacker
> Fix For: 0.7.0
>
> Attachments: hive-1856.patch, hive-1856.patch
>
>
> This issue combines issues HIVE-1550/1165/1542/1551:
> - augment DROP TABLE/VIEW with IF EXISTS
> - signal an error if the table/view doesn't exist and IF EXISTS wasn't 
> specified
> - introduce a flag in the configuration that allows you to turn off the new 
> behavior

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Review Request: HIVE-1696

2010-12-21 Thread John Sichi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/189/
---

Review request for hive.


Summary
---

HIVE-1696


This addresses bug HIVE-1696.
https://issues.apache.org/jira/browse/HIVE-1696


Diffs
-

  
http://svn.apache.org/repos/asf/hive/trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
 1050266 
  http://svn.apache.org/repos/asf/hive/trunk/metastore/if/hive_metastore.thrift 
1050266 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
 1050266 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1050266 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java
 1050266 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20/java/org/apache/hadoop/hive/shims/Hadoop20Shims.java
 1050266 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/shims/Hadoop20SShims.java
 1050266 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenIdentifier.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSecretManager.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/DelegationTokenSelector.java
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/0.20S/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge20S.java
 1050266 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/shims/HadoopShims.java
 1050266 
  
http://svn.apache.org/repos/asf/hive/trunk/shims/src/common/java/org/apache/hadoop/hive/thrift/HadoopThriftAuthBridge.java
 1050266 

Diff: https://reviews.apache.org/r/189/diff


Testing
---


Thanks,

John



[jira] Updated: (HIVE-1818) Call frequency and duration metrics for HiveMetaStore via jmx

2010-12-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-1818:
-

Status: Patch Available  (was: Open)

> Call frequency and duration metrics for HiveMetaStore via jmx
> -
>
> Key: HIVE-1818
> URL: https://issues.apache.org/jira/browse/HIVE-1818
> Project: Hive
>  Issue Type: New Feature
>  Components: Metastore
>Reporter: Sushanth Sowmyan
>Priority: Minor
> Attachments: HIVE-1818.patch
>
>
> As recently brought up on the hive-dev mailing list, it'd be useful if the 
> HiveMetaStore had some sort of instrumentation capability to measure the 
> frequency of calls to its various methods and the time spent in those calls. 
> There are already incrementCounter() and logStartFunction() / 
> logStartTableFunction() ,etc calls in HiveMetaStore, and they could be 
> refactored/repurposed to make calls that expose JMX MBeans as well. Or, a 
> Metrics subsystem could be introduced which made calls to 
> incrementCounter()/etc as a refactor.
> It might also be possible to specify a -D parameter that the Metrics 
> subsystem could use to determine whether or not it is enabled, and if so, on 
> what port. And once we have the capability to instrument and expose MBeans, 
> other subsystems could also adopt and use this system.
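
As a sketch of the general JMX pattern being proposed (standard MBean registration, 
nothing Hive-specific; the MBean name and counter are illustrative):

    import java.lang.management.ManagementFactory;
    import java.util.concurrent.atomic.AtomicLong;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    public class MetricsSketch {
      // Standard MBean convention: implementation Foo, management interface FooMBean.
      public interface MetaStoreCallsMBean {
        long getGetTableCalls();
      }

      public static class MetaStoreCalls implements MetaStoreCallsMBean {
        private final AtomicLong getTableCalls = new AtomicLong();
        public void incrementGetTable() { getTableCalls.incrementAndGet(); }
        public long getGetTableCalls() { return getTableCalls.get(); }
      }

      public static void main(String[] args) throws Exception {
        MetaStoreCalls metrics = new MetaStoreCalls();
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Expose the counter so jconsole or any JMX client can read call frequencies.
        server.registerMBean(metrics, new ObjectName("org.apache.hadoop.hive:type=MetaStoreMetrics"));
        metrics.incrementGetTable();  // would be called from e.g. get_table()
      }
    }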

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1806) The merge criteria on dynamic partitons should be per partiton

2010-12-21 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974101#action_12974101
 ] 

Namit Jain commented on HIVE-1806:
--

I will take a look

> The merge criteria on dynamic partitons should be per partiton
> --
>
> Key: HIVE-1806
> URL: https://issues.apache.org/jira/browse/HIVE-1806
> Project: Hive
>  Issue Type: Bug
>Reporter: Ning Zhang
>Assignee: Ning Zhang
> Attachments: HIVE-1806.2.patch, HIVE-1806.patch
>
>
> Currently the criterion for whether a merge job should be fired on dynamically 
> generated partitions is the average file size across all dynamic partitions. It 
> is very common that some dynamic partitions contain mostly large files and some 
> contain mostly small files. Even if the average size across all files is larger 
> than hive.merge.smallfiles.avgsize, we should still merge those partitions that 
> contain only small files. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.