[jira] Commented: (HIVE-417) Implement Indexing in Hive

2009-04-25 Thread Prasad Chakka (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702769#action_12702769 ]

Prasad Chakka commented on HIVE-417:


HIVE-1230 has changed the interface for RecordReader, and it no longer has a 
getPos() method. The older interfaces are deprecated. I used this method in the 
prototype to get the current position while creating the index and also while 
reading the actual data file. Even SequenceFileRecordReader does not have 
this method. 

Without getPos() and seek() methods on RecordReader, it becomes tough to 
implement any kind of generic indexing.
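
To illustrate why both calls matter, here is a minimal sketch in plain Python, not Hive or Hadoop code. The file object's tell() and seek() stand in for getPos() and seek(); build_offset_index and lookup are hypothetical names, and the '|'-delimited text layout is an assumption made only for this example.

```python
def build_offset_index(path, delimiter=b"|"):
    """Map the first field of each row to the byte offset where the row starts."""
    index = {}
    with open(path, "rb") as f:
        while True:
            pos = f.tell()  # plays the role of getPos(): offset of the next record
            line = f.readline()
            if not line:
                break
            key = line.rstrip(b"\n").split(delimiter, 1)[0]
            index.setdefault(key, pos)  # keep the first occurrence of each key
    return index


def lookup(path, index, key):
    """Jump straight to the indexed row instead of scanning the whole file."""
    pos = index.get(key)
    if pos is None:
        return None
    with open(path, "rb") as f:
        f.seek(pos)  # plays the role of seek() on the reader
        return f.readline().rstrip(b"\n")
```

Building the index needs the position of every record as it is read, and using the index needs a way to reposition the reader. If either call is missing, the index cannot be built or cannot be used.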


 Implement Indexing in Hive
 --

 Key: HIVE-417
 URL: https://issues.apache.org/jira/browse/HIVE-417
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Metastore, Query Processor
Affects Versions: 0.2.0, 0.3.0, 0.3.1, 0.4.0
Reporter: Prasad Chakka

 Implement indexing on Hive so that lookup and range queries are efficient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-446) Implement TRUNCATE

2009-04-25 Thread Prasad Chakka (JIRA)
Implement TRUNCATE
--

 Key: HIVE-446
 URL: https://issues.apache.org/jira/browse/HIVE-446
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Prasad Chakka
 Fix For: 0.4.0


Truncate the data but leave the table and metadata intact.




[jira] Commented: (HIVE-446) Implement TRUNCATE

2009-04-25 Thread Prasad Chakka (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702771#action_12702771 ]

Prasad Chakka commented on HIVE-446:


Workaround from Matt Prestritto: overwrite the table with the result of a query that matches no rows.
INSERT OVERWRITE TABLE tbl SELECT * FROM tbl WHERE col1 = 'fake value';




Build failed in Hudson: Hive-trunk-h0.18 #74

2009-04-25 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/74/changes

Changes:

[zshao] HIVE-442. Move the data before creating the partition. (Prasad Chakka 
via zshao)

[zshao] HIVE-324. Fix AccessControlException when loading data. (Ashish Thusoo 
via zshao)

[johan] HIVE-433. Fixed union18 and union19 tests. (athusoo via johan)

[zshao] HIVE-376. In strict mode do not allow join without ON condition. 
(Namit Jain via zshao)

[zshao] HIVE-250. Shared memory java dbm for map-side joins. (Joydeep Sen Sarma 
via zshao)

[namit] HIVE-366. testParse should not depend on a static field
(Zheng Shao via namit)

[namit] HIVE-437. Allow both table.name and col.field
(Zheng Shao via namit)

--
[...truncated 34903 lines...]
[junit] Loading data to table src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] OK
[junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/build/ql/test/logs/negative/unknown_column6.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/src/test/results/compiler/errors/unknown_column6.q.out
[junit] Done query: unknown_column6.q
[junit] Begin query: unknown_function1.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] OK
[junit] Loading data to table srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] OK
[junit] Loading data to table src
[junit] OK
[junit] Loading data to table src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] OK
[junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/build/ql/test/logs/negative/unknown_function1.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/src/test/results/compiler/errors/unknown_function1.q.out
[junit] Done query: unknown_function1.q
[junit] Begin query: unknown_function2.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] OK
[junit] Loading data to table srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] OK
[junit] Loading data to table src
[junit] OK
[junit] Loading data to table src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] OK
[junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/build/ql/test/logs/negative/unknown_function2.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/src/test/results/compiler/errors/unknown_function2.q.out
[junit] Done query: unknown_function2.q
[junit] Begin query: unknown_function3.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] OK
[junit] Loading data to table srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] OK
[junit] Loading data to table src
[junit] OK
[junit] Loading data to table src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] OK
[junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/build/ql/test/logs/negative/unknown_function3.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/ws/hive/ql/src/test/results/compiler/errors/unknown_function3.q.out
[junit] Done query: unknown_function3.q
[junit] Begin query: unknown_function4.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
   

Build failed in Hudson: Hive-trunk-h0.19 #72

2009-04-25 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/72/changes

Changes:

[zshao] HIVE-442. Move the data before creating the partition. (Prasad Chakka 
via zshao)

[zshao] HIVE-324. Fix AccessControlException when loading data. (Ashish Thusoo 
via zshao)

[johan] HIVE-433. Fixed union18 and union19 tests. (athusoo via johan)

[zshao] HIVE-376. In strict mode do not allow join without ON condition. 
(Namit Jain via zshao)

[zshao] HIVE-250. Shared memory java dbm for map-side joins. (Joydeep Sen Sarma 
via zshao)

[namit] HIVE-366. testParse should not depend on a static field
(Zheng Shao via namit)

--
[...truncated 34085 lines...]
[junit] Loading data to table src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] OK
[junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/build/ql/test/logs/negative/unknown_column6.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/src/test/results/compiler/errors/unknown_column6.q.out
[junit] Done query: unknown_column6.q
[junit] Begin query: unknown_function1.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] OK
[junit] Loading data to table srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] OK
[junit] Loading data to table src
[junit] OK
[junit] Loading data to table src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] OK
[junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/build/ql/test/logs/negative/unknown_function1.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/src/test/results/compiler/errors/unknown_function1.q.out
[junit] Done query: unknown_function1.q
[junit] Begin query: unknown_function2.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] OK
[junit] Loading data to table srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] OK
[junit] Loading data to table src
[junit] OK
[junit] Loading data to table src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] OK
[junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/build/ql/test/logs/negative/unknown_function2.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/src/test/results/compiler/errors/unknown_function2.q.out
[junit] Done query: unknown_function2.q
[junit] Begin query: unknown_function3.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=12}
[junit] OK
[junit] Loading data to table srcbucket
[junit] OK
[junit] Loading data to table srcbucket
[junit] OK
[junit] Loading data to table src
[junit] OK
[junit] Loading data to table src1
[junit] OK
[junit] Loading data to table src_sequencefile
[junit] OK
[junit] Loading data to table src_thrift
[junit] OK
[junit] Loading data to table src_json
[junit] OK
[junit] diff http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/build/ql/test/logs/negative/unknown_function3.q.out http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.19/ws/hive/ql/src/test/results/compiler/errors/unknown_function3.q.out
[junit] Done query: unknown_function3.q
[junit] Begin query: unknown_function4.q
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-08, hr=12}
[junit] OK
[junit] Loading data to table srcpart partition {ds=2008-04-09, hr=11}
[junit] OK
[junit] Loading data to table srcpart partition 

[jira] Created: (HIVE-447) Test join32 fails on hudson

2009-04-25 Thread Johan Oskarsson (JIRA)
Test join32 fails on hudson
---

 Key: HIVE-447
 URL: https://issues.apache.org/jira/browse/HIVE-447
 Project: Hadoop Hive
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Johan Oskarsson
 Fix For: 0.4.0


See this build for more information

http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/lastBuild/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_join23/




[jira] Updated: (HIVE-447) Test join23 fails on hudson

2009-04-25 Thread Johan Oskarsson (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Johan Oskarsson updated HIVE-447:
-

Summary: Test join23 fails on hudson  (was: Test join32 fails on hudson)

 Test join23 fails on hudson
 ---

 Key: HIVE-447
 URL: https://issues.apache.org/jira/browse/HIVE-447
 Project: Hadoop Hive
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Johan Oskarsson
 Fix For: 0.4.0


 See this build for more information
 http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/lastBuild/testReport/org.apache.hadoop.hive.cli/TestCliDriver/testCliDriver_join23/




more ClassCastExceptions with joins, group by and non-strings

2009-04-25 Thread Peter Alvaro
Hi,
It appears that queries combining all three of a join, a group by, and a
non-string datatype crash in the SerDe code run at the reducer:


java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.DoubleWritable cannot be cast to org.apache.hadoop.io.Text
    at org.apache.hadoop.hive.ql.exec.ExecReducer.reduce(ExecReducer.java:179)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:430)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:170)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.hive.serde2.io.DoubleWritable cannot be cast to org.apache.hadoop.io.Text
    at org.apache.hadoop.hive.ql.exec.GroupByOperator.process(GroupByOperator.java:489)
    at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:359)


The workaround mentioned in the FAQ(3) (and reported fixed by HIVE-405,
although this seems to be a different issue) does not fix the problem,
which has existed since Hive revision 764548. I am using Hadoop 0.19,
though I get the same errors when I use the latest trunk.

This script (which includes the workaround) reproduces the issue:


-
create table foo (
bas string,
bam double
)

ROW FORMAT DELIMITED FIELDS TERMINATED BY '\174';

create table bar (
bas string--,
--bat double
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\174';



load data local inpath '/PATH/TO/demo_foo.txt' overwrite into table foo;
load data local inpath '/PATH/TO/demo_bar.txt' overwrite into table bar;


select f.bas, cast(bam as string)
from foo f
join bar b on (f.bas = b.bas)
group by f.bas, cast(bam as string);

---

contents of demo_foo.txt:

11234325|0.123
221346|10.12
33463246|100.25
432462634|0.12
5346236|345.12


contents of demo_bar.txt:

11234325|0.1222
221346|1.11
33463246|235.23
432462634|6.33
5346236|77.77
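
For anyone reproducing this, the two input files can be regenerated with a short script. This is a convenience sketch in Python and not part of the original report; write_demo_files and its directory argument are hypothetical names. It also makes explicit that '\174' in the DDL above is the octal escape for '|'.

```python
import os

# Octal 174 is decimal 124, the ASCII pipe character used as the field delimiter.
DELIMITER = chr(0o174)

# Rows copied from the listings above.
FOO_ROWS = [
    ("11234325", "0.123"),
    ("221346", "10.12"),
    ("33463246", "100.25"),
    ("432462634", "0.12"),
    ("5346236", "345.12"),
]
BAR_ROWS = [
    ("11234325", "0.1222"),
    ("221346", "1.11"),
    ("33463246", "235.23"),
    ("432462634", "6.33"),
    ("5346236", "77.77"),
]


def write_demo_files(directory):
    """Write demo_foo.txt and demo_bar.txt into `directory`; return their paths."""
    paths = {}
    for name, rows in (("demo_foo.txt", FOO_ROWS), ("demo_bar.txt", BAR_ROWS)):
        path = os.path.join(directory, name)
        with open(path, "w", newline="\n") as f:
            for row in rows:
                f.write(DELIMITER.join(row) + "\n")
        paths[name] = path
    return paths
```

The returned paths can then be substituted for /PATH/TO/ in the load data statements.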

thanks,

Peter Alvaro
UC Berkeley


building metastore thriftif

2009-04-25 Thread Edward Capriolo
I have read that the Thrift release from Apache Thrift is not yet compatible
with Hive. I need to rebuild the metastore. What version of Thrift should I
use, and where can I get it?

Thanks,
Edward