[jira] Updated: (HIVE-576) complete jdbc driver

2009-09-16 Thread Bill Graham (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Graham updated HIVE-576:
-

Component/s: Clients

> complete jdbc driver
> 
>
> Key: HIVE-576
> URL: https://issues.apache.org/jira/browse/HIVE-576
> Project: Hadoop Hive
>  Issue Type: Improvement
>  Components: Clients
>Affects Versions: 0.4.0
>Reporter: Min Zhou
>Assignee: Min Zhou
> Fix For: 0.5.0
>
> Attachments: HIVE-576.1.patch, HIVE-576.2.patch, sqlexplorer.jpg
>
>
> Hive only supports a few of the JDBC interfaces; let's complete them.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-78) Authentication infrastructure for Hive

2009-09-16 Thread Min Zhou (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756335#action_12756335
 ] 

Min Zhou commented on HIVE-78:
--

@Edward

Sorry for my abuse of some words; I hope this will not affect our work.

Can you point me to the jiras where it was decided that Hive will not store 
username/password information and that Hadoop will?
I think most companies are using Hadoop versions from 0.17 to 0.20, which 
don't have good password security. Once a company adopts a particular 
version, upgrading is a major undertaking, so many companies will stick with 
a more stable version. Moreover, Hadoop still does not have that feature, 
and it may take a long time to implement. Why should we wait rather than 
build it ourselves? I think Hive needs to support user/password at least for 
current versions of Hadoop. Many companies using Hive have reported that it 
is inconvenient for multi-user setups, as well as for environment isolation, 
table sharing, security, etc. We must try to meet the requirements of most 
of them.

Regarding the syntax, I guess we can do it in two steps. 
# support GRANT/REVOKE privileges to users.
# support some sort of server administration privileges as Ashish mentioned. 
The GRANT statement enables system administrators to create Hive user accounts 
and to grant rights to accounts. To use GRANT, you must have the GRANT OPTION 
privilege, and you must have the privileges that you are granting. The REVOKE 
statement is related and enables administrators to remove account privileges.

The file hive-78-syntax-v1.patch modifies the syntax. Any comments on that?
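For illustration only (the real grammar is whatever hive-78-syntax-v1.patch defines, and the table and user names below are invented), step 1 might look like familiar SQL:

{code}
-- Hypothetical sketch of GRANT/REVOKE for step 1; not the patch's grammar.
GRANT SELECT ON TABLE weblogs TO USER min;
REVOKE SELECT ON TABLE weblogs FROM USER min;
{code}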


> Authentication infrastructure for Hive
> --
>
> Key: HIVE-78
> URL: https://issues.apache.org/jira/browse/HIVE-78
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Server Infrastructure
>Reporter: Ashish Thusoo
>Assignee: Edward Capriolo
> Attachments: hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, 
> hive-78.diff
>
>
> Allow hive to integrate with existing user repositories for authentication 
> and authorization information.




[jira] Created: (HIVE-840) no error if user specifies multiple columns of same name as output

2009-09-16 Thread Namit Jain (JIRA)
no error if user specifies multiple columns of same name as output
--

 Key: HIVE-840
 URL: https://issues.apache.org/jira/browse/HIVE-840
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain


INSERT OVERWRITE TABLE table_name_here
SELECT TRANSFORM(key,val)
USING '/script/'
AS foo, foo, foo


The above query should fail, but it succeeds.
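The missing check could be as simple as scanning the output alias list for repeats. A purely illustrative sketch in Python follows (Hive's semantic analyzer is Java, and the function name here is invented):

```python
def find_duplicate_aliases(aliases):
    """Return the set of output column aliases that appear more than once."""
    seen, dupes = set(), set()
    for alias in aliases:
        key = alias.lower()  # Hive column names are case-insensitive
        if key in seen:
            dupes.add(key)
        seen.add(key)
    return dupes

# The AS clause from the query above would be flagged:
print(find_duplicate_aliases(["foo", "foo", "foo"]))  # {'foo'}
print(find_duplicate_aliases(["key", "val"]))         # set()
```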




[jira] Created: (HIVE-839) change default value of hive.mapred.mode to strict

2009-09-16 Thread Namit Jain (JIRA)
change default value of hive.mapred.mode to strict
--

 Key: HIVE-839
 URL: https://issues.apache.org/jira/browse/HIVE-839
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain


Most of the production servers would run in strict mode, so it might be a good 
idea to have that as the default.
It helps catch some bugs that might otherwise go unnoticed in nonstrict mode.
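As a hedged illustration (using the srcpart test table and its ds partition column that appear elsewhere in these issues), the main check strict mode adds is rejecting full scans of partitioned tables:

{code}
set hive.mapred.mode=strict;

-- Rejected in strict mode: no predicate on the partition column ds
SELECT * FROM srcpart;

-- Accepted: the predicate lets Hive prune to one partition
SELECT * FROM srcpart WHERE ds = '2009-08-09';
{code}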




[jira] Updated: (HIVE-838) in strict mode, no partition selected error

2009-09-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-838:


Attachment: hive.838.2.patch

incorporated Raghu's comments

> in strict mode, no partition selected error
> ---
>
> Key: HIVE-838
> URL: https://issues.apache.org/jira/browse/HIVE-838
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.838.1.patch, hive.838.2.patch
>
>
> set hive.mapred.mode=strict;
> select * from 
>   (select count(1) from src 
> union all
>select count(1) from srcpart where ds = '2009-08-09'
>   )x;
> Is it a blocker for 0.4 ?




[jira] Commented: (HIVE-838) in strict mode, no partition selected error

2009-09-16 Thread Raghotham Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756237#action_12756237
 ] 

Raghotham Murthy commented on HIVE-838:
---

i guess it would be good to include this fix in 0.4.

couple of comments:

1. can you change the query with the empty partition to be a non-aggregate 
query? need to make sure additional rows are not added to the result. something 
like:
{code}
select * from 
  (select count(1) from src 
union all
   select 1 from srcpart where ds = '2009-08-09'
  )x;
{code}

2. remove the code instead of commenting it out.

> in strict mode, no partition selected error
> ---
>
> Key: HIVE-838
> URL: https://issues.apache.org/jira/browse/HIVE-838
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.838.1.patch
>
>
> set hive.mapred.mode=strict;
> select * from 
>   (select count(1) from src 
> union all
>select count(1) from srcpart where ds = '2009-08-09'
>   )x;
> Is it a blocker for 0.4 ?




[jira] Updated: (HIVE-838) in strict mode, no partition selected error

2009-09-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-838:


Status: Patch Available  (was: Open)

> in strict mode, no partition selected error
> ---
>
> Key: HIVE-838
> URL: https://issues.apache.org/jira/browse/HIVE-838
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.838.1.patch
>
>
> set hive.mapred.mode=strict;
> select * from 
>   (select count(1) from src 
> union all
>select count(1) from srcpart where ds = '2009-08-09'
>   )x;
> Is it a blocker for 0.4 ?




[jira] Updated: (HIVE-838) in strict mode, no partition selected error

2009-09-16 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain updated HIVE-838:


Attachment: hive.838.1.patch

> in strict mode, no partition selected error
> ---
>
> Key: HIVE-838
> URL: https://issues.apache.org/jira/browse/HIVE-838
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Reporter: Namit Jain
>Assignee: Namit Jain
> Attachments: hive.838.1.patch
>
>
> set hive.mapred.mode=strict;
> select * from 
>   (select count(1) from src 
> union all
>select count(1) from srcpart where ds = '2009-08-09'
>   )x;
> Is it a blocker for 0.4 ?




[jira] Created: (HIVE-838) in strict mode, no partition selected error

2009-09-16 Thread Namit Jain (JIRA)
in strict mode, no partition selected error
---

 Key: HIVE-838
 URL: https://issues.apache.org/jira/browse/HIVE-838
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Namit Jain
Assignee: Namit Jain



set hive.mapred.mode=strict;

select * from 
  (select count(1) from src 
union all
   select count(1) from srcpart where ds = '2009-08-09'
  )x;


Is it a blocker for 0.4 ?




[jira] Commented: (HIVE-837) virtual column support (filename) in hive

2009-09-16 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756185#action_12756185
 ] 

Edward Capriolo commented on HIVE-837:
--

Also, "describe partition show files" would be useful.

> virtual column support (filename) in hive
> -
>
> Key: HIVE-837
> URL: https://issues.apache.org/jira/browse/HIVE-837
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Reporter: Namit Jain
>
> Copying from some mails:
> I am dumping files into a Hive partition on five minute intervals. I am using 
> LOAD DATA into a partition.
> weblogs
> web1.00
> web1.05
> web1.10
> ...
> web2.00
> web2.05
> web1.10
> 
> Things that would be useful..
> Select files from the folder with a regex or exact name
> select * FROM logs where FILENAME LIKE(WEB1*)
> select * FROM LOGS WHERE FILENAME=web2.00
> Also it would be nice to be able to select offsets in a file, this would make 
> sense with appends
> select * from logs WHERE FILENAME=web2.00 FROMOFFSET=454644 [TOOFFSET=]
> select  
> substr(filename, 4, 7) as  class_A, 
> substr(filename,  8, 10) as class_B
> count( x ) as cnt
> from FOO
> group by
> substr(filename, 4, 7), 
> substr(filename,  8, 10) ;
> Hive should support virtual columns




[jira] Created: (HIVE-837) virtual column support (filename) in hive

2009-09-16 Thread Namit Jain (JIRA)
virtual column support (filename) in hive
-

 Key: HIVE-837
 URL: https://issues.apache.org/jira/browse/HIVE-837
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain


Copying from some mails:


I am dumping files into a Hive partition on five minute intervals. I am using 
LOAD DATA into a partition.

weblogs
web1.00
web1.05
web1.10
...
web2.00
web2.05
web1.10


Things that would be useful..

Select files from the folder with a regex or exact name

select * FROM logs where FILENAME LIKE(WEB1*)

select * FROM LOGS WHERE FILENAME=web2.00

Also it would be nice to be able to select offsets in a file, this would make 
sense with appends

select * from logs WHERE FILENAME=web2.00 FROMOFFSET=454644 [TOOFFSET=]




select  
substr(filename, 4, 7) as  class_A, 
substr(filename,  8, 10) as class_B
count( x ) as cnt
from FOO
group by
substr(filename, 4, 7), 
substr(filename,  8, 10) ;



Hive should support virtual columns
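Outside Hive, the kind of filename predicate requested above can be emulated with glob matching. The sketch below is illustrative only (Python, invented helper name) and uses the example filenames from the mail:

```python
import fnmatch

# Files that would live under the weblogs partition directory.
files = ["web1.00", "web1.05", "web1.10", "web2.00", "web2.05"]

def select_files(pattern):
    """Return the files a FILENAME LIKE(pattern)-style predicate would match."""
    return [f for f in files if fnmatch.fnmatch(f, pattern)]

print(select_files("web1*"))    # ['web1.00', 'web1.05', 'web1.10']
print(select_files("web2.00"))  # ['web2.00']
```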




RE: vote for release candidate for hive

2009-09-16 Thread Namit Jain
Generated the tar balls:


Look at

/home/namit/public_html/hive-0.4.0-cadidate-1/hive-0.4.0-dev*

in people.apache.org


Thanks,
-namit


From: Ashish Thusoo [mailto:athu...@facebook.com]
Sent: Tuesday, September 15, 2009 4:41 PM
To: hive-u...@hadoop.apache.org
Subject: RE: vote for release candidate for hive

We need to generate the tar balls for the binary and the source release and put 
them out for vote.

Ashish


From: Namit Jain [mailto:nj...@facebook.com]
Sent: Tuesday, September 15, 2009 1:30 PM
To: hive-u...@hadoop.apache.org
Subject: vote for release candidate for hive

I have created another release candidate for Hive.



 https://svn.apache.org/repos/asf/hadoop/hive/tags/release-0.4.0-rc1/





Let me know if it is OK to publish this release candidate.



The only change from the previous candidate 
(https://svn.apache.org/repos/asf/hadoop/hive/tags/release-0.4.0-rc0/) is the 
fix for

https://issues.apache.org/jira/browse/HIVE-718







Thanks,

-namit





[jira] Created: (HIVE-836) Add syntax to force a new mapreduce job / transform subquery in mapper

2009-09-16 Thread Adam Kramer (JIRA)
Add syntax to force a new mapreduce job / transform subquery in mapper
--

 Key: HIVE-836
 URL: https://issues.apache.org/jira/browse/HIVE-836
 Project: Hadoop Hive
  Issue Type: Wish
Reporter: Adam Kramer


Hive currently does a lot of awesome work to figure out when my transformers 
should be used in the mapper and when they should be used in the reducer. 
However, sometimes I have a different plan.

For example, consider this:

SELECT TRANSFORM(a.val1, a.val2)
USING './niftyscript'
AS part1, part2, part3
FROM (
SELECT b.val AS val1, c.val AS val2
FROM tblb b JOIN tblc c on (b.key=c.key)
) a

...in this syntax b and c will be joined (in the reducer, of course), and then 
the rows that pass the join clause will be passed to niftyscript _in the 
reducer._ However, when niftyscript is high-computation and there is a lot of 
data coming out of the join but very few reducers, there's a huge hold-up. It 
would be awesome if I could somehow force a new mapreduce step after the 
subquery, so that ./niftyscript is run in the mappers rather than the prior 
step's reducers.

The current workaround is to dump everything to a temporary table and then 
start over, but that does not scale easily: the subquery structure effectively 
(and easily) "locks" the mid-points so no other job can touch the table.

SUGGESTED FIX: Either cause MAP and REDUCE to force map/reduce steps (c.f. 
https://issues.apache.org/jira/browse/HIVE-835 ), or add a query element to 
specify that "the job ends here." For example, in the above query, FROM a 
SELF-CONTAINED or PRECOMPUTE a or START JOB AFTER a or something like that.
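The temporary-table workaround mentioned above might look like this (a sketch only; the tmp_joined table name is invented and the column types are assumed to be strings):

{code}
CREATE TABLE tmp_joined (val1 STRING, val2 STRING);

-- Job 1: the join runs in its own mapreduce job.
INSERT OVERWRITE TABLE tmp_joined
SELECT b.val, c.val
FROM tblb b JOIN tblc c ON (b.key = c.key);

-- Job 2: the transform now runs in the mappers of a fresh job.
SELECT TRANSFORM(val1, val2)
USING './niftyscript'
AS part1, part2, part3
FROM tmp_joined;
{code}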





[jira] Created: (HIVE-835) Make MAP and REDUCE work as expected or add warnings

2009-09-16 Thread Adam Kramer (JIRA)
Make MAP and REDUCE work as expected or add warnings


 Key: HIVE-835
 URL: https://issues.apache.org/jira/browse/HIVE-835
 Project: Hadoop Hive
  Issue Type: Improvement
Reporter: Adam Kramer


There are syntactic elements MAP and REDUCE which function as syntactic sugar 
for SELECT TRANSFORM. This behavior is not at all intuitive, because no 
checking or verification is done to ensure that the user's intention is met.

Specifically, Hive may see a MAP query and simply tack the transform script on 
to the end of a reduce job (so the user says MAP but Hive does a REDUCE), or 
(more dangerously) vice versa. Given that Hive's whole point is to sit on top 
of a mapreduce framework and allow transformations in the mapper or reducer, it 
seems very inappropriate for Hive to take a clear command from the user to MAP 
or to REDUCE the data using a script and simply ignore it.

Better behavior would be for hive to see a MAP command and to start a new 
mapreduce step and run the command in the mapper (even if it otherwise would be 
run in the reducer), and for REDUCE to begin a reduce step if necessary (so, 
tack the REDUCE script on to the end of a REDUCE job if the current system 
would do so, or if not, treat the 0th column as the reduce key, throw a warning 
saying this has been done, and force a reduce job).

Acceptable behavior would be to throw an error or warning when the user's 
clearly-stated desire is going to be ignored. "Warning: User used MAP keyword, 
but transformation will occur in the reduce phase" / "Warning: User used REDUCE 
keyword, but did not specify DISTRIBUTE BY / CLUSTER BY column. Transformation 
will occur in the map phase."
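One way to make the intended phase explicit today (a sketch, not a fix for this issue; the script and output column names are invented): adding DISTRIBUTE BY to the subquery feeding a transform forces a shuffle, so the transform runs reduce-side, while omitting it leaves Hive free to run the transform map-side.

{code}
FROM (
  SELECT key, value FROM src DISTRIBUTE BY key
) s
SELECT TRANSFORM(s.key, s.value)
USING './my_reducer'
AS k, v;
{code}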




Re: vote for release candidate for hive

2009-09-16 Thread Johan Oskarsson
+1 based on running unit tests.

/Johan

Namit Jain wrote:
> Sorry, was meant for hive-dev@
> 
> From: Namit Jain [mailto:nj...@facebook.com]
> Sent: Tuesday, September 15, 2009 1:30 PM
> To: hive-u...@hadoop.apache.org
> Subject: vote for release candidate for hive
> 
> 
> I have created another release candidate for Hive.
> 
> 
> 
>  https://svn.apache.org/repos/asf/hadoop/hive/tags/release-0.4.0-rc1/
> 
> 
> 
> 
> 
> Let me know if it is OK to publish this release candidate.
> 
> 
> 
> The only change from the previous candidate 
> (https://svn.apache.org/repos/asf/hadoop/hive/tags/release-0.4.0-rc0/) is the 
> fix for
> 
> https://issues.apache.org/jira/browse/HIVE-718
> 
> 
> 
> 
> 
> 
> 
> Thanks,
> 
> -namit
> 
> 
> 
> 



[jira] Commented: (HIVE-78) Authentication infrastructure for Hive

2009-09-16 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756068#action_12756068
 ] 

Edward Capriolo commented on HIVE-78:
-

Min,

First, let me say you have probably come along much farther than me on this 
issue.

Your approach is too strong. Hive is an open-community process. Though it is 
not very detailed, we have loosely agreed on a spec (above); in that spec we 
have decided not to store username/password information in Hive. Rather, 
upstream is still going to be responsible for this information. We also agreed 
on syntax.

You should not throw up a new spec and some code and say something along the 
lines of "We are going to take over and do it this way". Imagine if on each 
jira issue you were working on you were 20% to 50% done, and then someone 
jumped in and said "I already finished it a different way"; that would be 
rather annoying. It would be a "first patch wins" system. 

First, before you write a line of code you should let someone know you intend 
to work on it. Otherwise what is the point of having two people work on 
something where one version gets thrown away? It is a waste, and this would be 
the second issue where this has happened to me. 

Second, even if you want to start coding it up, it has to be what people 
agreed on. We agreed not to store user/pass (Hadoop will be doing this upstream 
soon), and we agreed on syntax; if you want to reopen that issue, you should 
discuss it before coding it. It has to be good for the community, not just your 
deployment.

So where do we go from here? Do we go back to the design phase and describe all 
the syntax we want to support?

> Authentication infrastructure for Hive
> --
>
> Key: HIVE-78
> URL: https://issues.apache.org/jira/browse/HIVE-78
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Server Infrastructure
>Reporter: Ashish Thusoo
>Assignee: Edward Capriolo
> Attachments: hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, 
> hive-78.diff
>
>
> Allow hive to integrate with existing user repositories for authentication 
> and authorization information.




Build failed in Hudson: Hive-trunk-h0.18 #218

2009-09-16 Thread Apache Hudson Server
See http://hudson.zones.apache.org/hudson/job/Hive-trunk-h0.18/218/

--
started
Building remotely on minerva.apache.org (Ubuntu)
ERROR: svn: timed out waiting for server
svn: OPTIONS request failed on '/repos/asf/hadoop/hive/trunk'
org.tmatesoft.svn.core.SVNException: svn: timed out waiting for server
svn: OPTIONS request failed on '/repos/asf/hadoop/hive/trunk'
at 
org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:103)
at 
org.tmatesoft.svn.core.internal.wc.SVNErrorManager.error(SVNErrorManager.java:87)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:601)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:257)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:245)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.exchangeCapabilities(DAVConnection.java:454)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVConnection.open(DAVConnection.java:97)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVRepository.openConnection(DAVRepository.java:664)
at 
org.tmatesoft.svn.core.internal.io.dav.DAVRepository.testConnection(DAVRepository.java:96)
at 
hudson.scm.SubversionSCM$DescriptorImpl.checkRepositoryPath(SubversionSCM.java:1519)
at 
hudson.scm.SubversionSCM.repositoryLocationsExist(SubversionSCM.java:1620)
at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:455)
at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:418)
at hudson.model.AbstractProject.checkout(AbstractProject.java:801)
at 
hudson.model.AbstractBuild$AbstractRunner.checkout(AbstractBuild.java:314)
at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:266)
at hudson.model.Run.run(Run.java:896)
at hudson.model.Build.run(Build.java:112)
at hudson.model.ResourceController.execute(ResourceController.java:93)
at hudson.model.Executor.run(Executor.java:119)
Caused by: java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:519)
at 
org.tmatesoft.svn.core.internal.util.SVNSocketFactory.createPlainSocket(SVNSocketFactory.java:53)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.connect(HTTPConnection.java:167)
at 
org.tmatesoft.svn.core.internal.io.dav.http.HTTPConnection.request(HTTPConnection.java:307)
... 17 more
Recording test results



[jira] Updated: (HIVE-78) Authentication infrastructure for Hive

2009-09-16 Thread Min Zhou (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Min Zhou updated HIVE-78:
-

Attachment: hive-78-metadata-v1.patch

> Authentication infrastructure for Hive
> --
>
> Key: HIVE-78
> URL: https://issues.apache.org/jira/browse/HIVE-78
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Server Infrastructure
>Reporter: Ashish Thusoo
>Assignee: Edward Capriolo
> Attachments: hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, 
> hive-78.diff
>
>
> Allow hive to integrate with existing user repositories for authentication 
> and authorization information.




[jira] Updated: (HIVE-193) Cannot create table with tinyint type column

2009-09-16 Thread Johan Oskarsson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Oskarsson updated HIVE-193:
-

 Priority: Major  (was: Critical)
Fix Version/s: (was: 0.4.0)
   0.5.0

> Cannot create table with tinyint type column
> 
>
> Key: HIVE-193
> URL: https://issues.apache.org/jira/browse/HIVE-193
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor, Serializers/Deserializers
>Affects Versions: 0.2.0
>Reporter: Johan Oskarsson
> Fix For: 0.5.0
>
>
> Running this query "create table something2 (test tinyint);" gives the 
> following exception:
> org.apache.hadoop.hive.serde2.dynamic_type.ParseException: Encountered "byte" 
> at line 1, column 21.
> Was expecting one of:
> "bool" ...
> "i16" ...
> "i32" ...
> "i64" ...
> "double" ...
> "string" ...
> "map" ...
> "list" ...
> "set" ...
> "required" ...
> "optional" ...
> "skip" ...
>  ...
>  ...
> "}" ...
> 
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.generateParseException(thrift_grammar.java:2321)
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.jj_consume_token(thrift_grammar.java:2253)
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Struct(thrift_grammar.java:1172)
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.TypeDefinition(thrift_grammar.java:497)
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Definition(thrift_grammar.java:439)
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Start(thrift_grammar.java:101)
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe.initialize(DynamicSerDe.java:97)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:180)
>   at org.apache.hadoop.hive.ql.metadata.Table.initSerDe(Table.java:141)
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:202)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:641)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:98)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:215)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:174)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:207)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:305)




[jira] Updated: (HIVE-192) Cannot create table with timestamp type column

2009-09-16 Thread Johan Oskarsson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Oskarsson updated HIVE-192:
-

Fix Version/s: (was: 0.4.0)
   0.5.0
   Issue Type: New Feature  (was: Bug)

> Cannot create table with timestamp type column
> --
>
> Key: HIVE-192
> URL: https://issues.apache.org/jira/browse/HIVE-192
> Project: Hadoop Hive
>  Issue Type: New Feature
>  Components: Query Processor
>Affects Versions: 0.2.0
>Reporter: Johan Oskarsson
>Assignee: Shyam Sundar Sarkar
> Fix For: 0.5.0
>
> Attachments: create_2.q.txt, TIMESTAMP_specification.txt
>
>
> create table something2 (test timestamp);
> ERROR: DDL specifying type timestamp which has not been defined
> java.lang.RuntimeException: specifying type timestamp which has not been 
> defined
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.FieldType(thrift_grammar.java:1879)
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Field(thrift_grammar.java:1545)
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.FieldList(thrift_grammar.java:1501)
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Struct(thrift_grammar.java:1171)
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.TypeDefinition(thrift_grammar.java:497)
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Definition(thrift_grammar.java:439)
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.thrift_grammar.Start(thrift_grammar.java:101)
>   at 
> org.apache.hadoop.hive.serde2.dynamic_type.DynamicSerDe.initialize(DynamicSerDe.java:97)
>   at 
> org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:180)
>   at org.apache.hadoop.hive.ql.metadata.Table.initSerDe(Table.java:141)
>   at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:202)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:641)
>   at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:98)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:215)
>   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:174)
>   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:207)
>   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:305)




[jira] Updated: (HIVE-302) Implement "LINES TERMINATED BY"

2009-09-16 Thread Johan Oskarsson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Oskarsson updated HIVE-302:
-

Fix Version/s: (was: 0.4.0)
   0.5.0

> Implement "LINES TERMINATED BY"
> ---
>
> Key: HIVE-302
> URL: https://issues.apache.org/jira/browse/HIVE-302
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Serializers/Deserializers
>Reporter: Johan Oskarsson
> Fix For: 0.5.0
>
>
> Specifying "LINES TERMINATED BY" when creating a table currently doesn't do 
> anything when querying that data. It needs to be implemented to support 
> various datasets that end lines with characters other than just a line break.




[jira] Updated: (HIVE-524) ExecDriver adds 0 byte file to input paths

2009-09-16 Thread Johan Oskarsson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Johan Oskarsson updated HIVE-524:
-

Affects Version/s: (was: 0.4.0)
   0.5.0

> ExecDriver adds 0 byte file to input paths
> --
>
> Key: HIVE-524
> URL: https://issues.apache.org/jira/browse/HIVE-524
> Project: Hadoop Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.5.0
>Reporter: Johan Oskarsson
> Fix For: 0.4.0
>
>
> In the addInputPaths method in ExecDriver:
> If the input path of a partition cannot be found or contains no files with 
> data in them, a 0 byte file is created and added to the job instead. This 
> causes our custom InputFormat to throw an exception since it is asked to 
> process an unknown file format (not an lzo file).
