Build failed in Hudson: Hive-trunk-h0.18 #567

2010-10-12 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/567/changes

Changes:

[jvs] HIVE-1264. Make Hive work with Hadoop security
(Todd Lipcon via jvs)

--
[...truncated 31063 lines...]
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket23.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv1.txt
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv3.txt
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv1.seq
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/complex.seq
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/json.txt
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/build/ql/test/logs/negative/unknown_table1.q.out
 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/src/test/results/compiler/errors/unknown_table1.q.out
[junit] Done query: unknown_table1.q
[junit] Begin query: unknown_table2.q
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket0.txt
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket1.txt
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket20.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket21.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket22.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/srcbucket23.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv1.txt
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/data/files/kv3.txt
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Copying data from 

HIVE-1701

2010-10-12 Thread John Sichi
I'm going to start working on a patch for dropping pre-0.20 Hadoop.

JVS



[jira] Commented: (HIVE-1699) incorrect partition pruning ANALYZE TABLE

2010-10-12 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920343#action_12920343
 ] 

Namit Jain commented on HIVE-1699:
--

@Paul, are you working on optimizing Hive.getPartitionsByNames() by using
partition filtering?


 incorrect partition pruning ANALYZE TABLE
 -

 Key: HIVE-1699
 URL: https://issues.apache.org/jira/browse/HIVE-1699
 Project: Hadoop Hive
  Issue Type: Bug
Reporter: Ning Zhang
Assignee: Ning Zhang
 Attachments: HIVE-1699.patch


 If table T is partitioned, ANALYZE TABLE T PARTITION (...) COMPUTE 
 STATISTICS; will gather stats for all partitions, even though the partition 
 spec only selects a subset. 
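
 For illustration (partition values are hypothetical, borrowing the srcpart
 layout from the test logs above), a statement like the following is expected
 to gather stats only for the named partition, but per this report it gathers
 stats for every partition of the table:

 ANALYZE TABLE srcpart PARTITION (ds='2008-04-08', hr='11') COMPUTE STATISTICS;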

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: blob handling in hive

2010-10-12 Thread Ted Yu
One way is to store the blob in HBase and use HBaseHandler to access it.

On Tue, Oct 12, 2010 at 2:14 PM, Jinsong Hu jinsong...@hotmail.com wrote:

 Hi,
  I am using Sqoop to export data from MySQL to Hive. I noticed that Hive
 doesn't have a blob data type yet. Is there any way I can make Hive store
 blobs?

 Jimmy
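
 A rough sketch of what that suggestion looks like as DDL (the table, column,
 and column-family names here are made up, and this assumes the Hive HBase
 storage handler; how the blob bytes get encoded into the mapped column is a
 separate question the thread doesn't settle):

 CREATE TABLE hbase_blobs(id string, payload string)
 STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
 WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:payload")
 TBLPROPERTIES ("hbase.table.name" = "hbase_blobs");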



[jira] Updated: (HIVE-1701) drop support for pre-0.20 Hadoop versions

2010-10-12 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1701:
-

Status: Patch Available  (was: Open)

 drop support for pre-0.20 Hadoop versions
 -

 Key: HIVE-1701
 URL: https://issues.apache.org/jira/browse/HIVE-1701
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Affects Versions: 0.7.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.7.0

 Attachments: HIVE-1701.1.patch


 As discussed previously on the mailing lists, we're dropping support for 
 pre-0.20 Hadoop versions starting with Hive 0.7.  This JIRA issue is for 
 deleting the corresponding build and shim implementations.  The shim 
 mechanism itself will be left in place (we already have 0.20 and 0.20S 
 coexisting).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1701) drop support for pre-0.20 Hadoop versions

2010-10-12 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1701:
-

Attachment: HIVE-1701.1.patch

Note that this patch deletes a lot of files; the committer needs to do the 
following explicitly:

cd shims/src
svn remove 0.17 0.18 0.19


 drop support for pre-0.20 Hadoop versions
 -

 Key: HIVE-1701
 URL: https://issues.apache.org/jira/browse/HIVE-1701
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Build Infrastructure
Affects Versions: 0.7.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.7.0

 Attachments: HIVE-1701.1.patch


 As discussed previously on the mailing lists, we're dropping support for 
 pre-0.20 Hadoop versions starting with Hive 0.7.  This JIRA issue is for 
 deleting the corresponding build and shim implementations.  The shim 
 mechanism itself will be left in place (we already have 0.20 and 0.20S 
 coexisting).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-1704) remove Hadoop 0.17 specific test reference logs

2010-10-12 Thread John Sichi (JIRA)
remove Hadoop 0.17 specific test reference logs
---

 Key: HIVE-1704
 URL: https://issues.apache.org/jira/browse/HIVE-1704
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Testing Infrastructure
Affects Versions: 0.7.0
Reporter: John Sichi
Assignee: John Sichi
 Fix For: 0.7.0


While we were supporting Hadoop 0.17, we introduced a test mechanism to allow 
for different expected output for different Hadoop versions.  As of HIVE-1701, 
we're dropping support for pre-0.20 Hadoop versions, so we can delete the 
*_0.17 files (but keep the test infra mechanism).


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: blob handling in hive

2010-10-12 Thread Ted Yu
How about creating org.apache.hadoop.hive.serde2.io.BytesWritable which
wraps byte[] ?

On Tue, Oct 12, 2010 at 3:49 PM, Jinsong Hu jinsong...@hotmail.com wrote:

 storing the blob in HBase is too costly; HBase compaction costs lots of
 CPU. All I want to do is to be able to read the byte array out of a
 SequenceFile and map that byte array to a Hive column.
 I can write a SerDe for this purpose.

 I tried to define the data to be array<tinyint>. I then tried to write a
 custom SerDe; after I get the byte array off the disk, I need to map it,

  so I wrote the code:
 columnTypes =
     TypeInfoUtils.getTypeInfosFromTypeString("int,string,array<tinyint>");

 but then how do I convert the data in the row.set() method?

 I tried this:

   byte[] bContent = ev.get_content() == null ? null :
       (ev.get_content().getData() == null ? null : ev.get_content().getData());
   org.apache.hadoop.hive.serde2.io.ByteWritable tContent = bContent == null
       ? new org.apache.hadoop.hive.serde2.io.ByteWritable()
       : new org.apache.hadoop.hive.serde2.io.ByteWritable(bContent[0]);
   row.set(2, tContent);

 this works for a single byte, but it doesn't work for a byte array.
 Any way I can get the byte array returned in SQL would be appreciated.

 Jimmy





Re: blob handling in hive

2010-10-12 Thread Jinsong Hu
I thought about that too, but then I would need to write a bytes inspector and
stick it into the Hive inspector factory. We would also need to create a new
data type, such as blob, in Hive's supported data types. Adding a new supported
data type to Hive is a non-trivial task, as more code will need to be touched.


I am just wondering if it is possible to do what I want without such a
big change.



Jimmy.

--
From: Ted Yu yuzhih...@gmail.com
Sent: Tuesday, October 12, 2010 4:12 PM
To: dev@hive.apache.org
Subject: Re: blob handling in hive


How about creating org.apache.hadoop.hive.serde2.io.BytesWritable which
wraps byte[] ?
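
One lighter-weight direction, sketched here under the assumption that the
SerDe exposes the column through a standard list ObjectInspector over writable
bytes (the thread doesn't confirm which inspectors are in play): keep the
declared type array<tinyint> and hand the bytes back as a list, one
ByteWritable per byte, instead of wrapping only bContent[0].

import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hive.serde2.io.ByteWritable;

// Sketch: turn the raw blob bytes into the kind of value a list-typed
// (array<tinyint>) column can carry, rather than a single ByteWritable.
public class BlobAsTinyintArray {
  public static List<ByteWritable> toTinyintArray(byte[] bContent) {
    if (bContent == null) {
      return null;                       // NULL column value
    }
    List<ByteWritable> content = new ArrayList<ByteWritable>(bContent.length);
    for (byte b : bContent) {
      content.add(new ByteWritable(b));  // one list element per blob byte
    }
    return content;
  }
}

In the SerDe's deserialize(), something like row.set(2, toTinyintArray(bContent))
would then populate the whole array rather than just its first byte.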



[jira] Resolved: (HIVE-1669) non-deterministic display of storage parameter in test

2010-10-12 Thread Ning Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang resolved HIVE-1669.
--

Resolution: Duplicate

This task is fixed as part of HIVE-1658. Closing. 

 non-deterministic display of storage parameter in test
 --

 Key: HIVE-1669
 URL: https://issues.apache.org/jira/browse/HIVE-1669
 Project: Hadoop Hive
  Issue Type: Sub-task
Reporter: Ning Zhang
Assignee: Thiruvel Thirumoolan
 Attachments: HIVE-1669.patch


 With the change to beautify the 'desc extended table' output, the storage 
 parameters are displayed in a non-deterministic order (since the underlying 
 implementation is a HashMap). 
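
 A sketch of the general approach for this kind of problem (illustrative only;
 the actual fix went in as part of HIVE-1658): render the parameters in sorted
 key order instead of relying on HashMap iteration order.

 import java.util.Map;
 import java.util.TreeMap;

 // Sketch: print map entries in a stable, sorted order so test output
 // does not depend on HashMap iteration order.
 public class SortedParams {
   public static String render(Map<String, String> params) {
     StringBuilder sb = new StringBuilder();
     for (Map.Entry<String, String> e :
         new TreeMap<String, String>(params).entrySet()) {
       if (sb.length() > 0) {
         sb.append(", ");
       }
       sb.append(e.getKey()).append('=').append(e.getValue());
     }
     return sb.toString();
   }
 }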

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.