Re: [VOTE] hive release candidate 0.4.1-rc0
I think there may be a bug still in this release.

hive> select stuff_status from auctions where auction_id='2591238417' and pt='20091027';

auctions is a table partitioned by date, stored as a textfile without compression. The query above should return 0 rows, but when hive.exec.compress.output=true, Hive crashes with a StackOverflowError:

java.lang.StackOverflowError
        at java.lang.ref.FinalReference.<init>(FinalReference.java:16)
        at java.lang.ref.Finalizer.<init>(Finalizer.java:66)
        at java.lang.ref.Finalizer.register(Finalizer.java:72)
        at java.lang.Object.<init>(Object.java:20)
        at java.net.SocketImpl.<init>(SocketImpl.java:27)
        at java.net.PlainSocketImpl.<init>(PlainSocketImpl.java:90)
        at java.net.SocksSocketImpl.<init>(SocksSocketImpl.java:33)
        at java.net.Socket.setImpl(Socket.java:434)
        at java.net.Socket.<init>(Socket.java:68)
        at sun.nio.ch.SocketAdaptor.<init>(SocketAdaptor.java:50)
        at sun.nio.ch.SocketAdaptor.create(SocketAdaptor.java:55)
        at sun.nio.ch.SocketChannelImpl.socket(SocketChannelImpl.java:105)
        at org.apache.hadoop.net.StandardSocketFactory.createSocket(StandardSocketFactory.java:58)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1540)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1662)
        at java.io.DataInputStream.read(DataInputStream.java:132)
        at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:96)
        at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:86)
        at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:74)
        at java.io.InputStream.read(InputStream.java:85)
        at org.apache.hadoop.util.LineReader.backfill(LineReader.java:82)
        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:112)
        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:134)
        at org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:39)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:256)
        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)

Each mapper produces an 8-byte deflate file on HDFS (we set hive.merge.mapfiles=false); its hex representation is:

78 9C 03 00 00 00 00 01

This is why FetchOperator.java:272 is called recursively, causing the stack overflow.

Regards,
Min

On Mon, Nov 2, 2009 at 6:34 AM, Zheng Shao zsh...@gmail.com wrote:
> I have made a release candidate 0.4.1-rc0.
>
> We've fixed several critical bugs in hive release 0.4.0. We need hive release 0.4.1 out asap.
>
> Here is the list of changes:
> HIVE-884. Metastore Server should call System.exit() on error. (Zheng Shao via pchakka)
> HIVE-864. Fix map-join memory-leak. (Namit Jain via zshao)
> HIVE-878. Update the hash table entry before flushing in Group By hash aggregation. (Zheng Shao via namit)
> HIVE-882. Create a new directory every time for scratch. (Namit Jain via zshao)
> HIVE-890. Fix cli.sh for detecting Hadoop versions. (Paul Huff via zshao)
> HIVE-892. Hive to kill hadoop jobs using POST. (Dhruba Borthakur via zshao)
> HIVE-883. URISyntaxException when partition value contains special chars. (Zheng Shao via namit)
>
> Please vote.
>
> --
> Yours,
> Zheng

--
My research interests are distributed systems, parallel computing and bytecode-based virtual machines.
My profile: http://www.linkedin.com/in/coderplay
My blog: http://coderplay.javaeye.com
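Those eight bytes are a complete zlib stream whose payload is empty, which is easy to verify, and an operator that advances past an empty split by calling itself once per empty file will exhaust the stack once there are enough of them. A quick illustration in Python (the get_next_row function below is a hypothetical sketch of the failure mode, not Hive's actual code):

```python
import sys
import zlib

# The 8-byte file each mapper wrote: 78 9C is the zlib header, 03 00 is an
# empty final deflate block, and 00 00 00 01 is the Adler-32 checksum of "".
data = bytes([0x78, 0x9C, 0x03, 0x00, 0x00, 0x00, 0x00, 0x01])
assert zlib.decompress(data) == b""  # a valid stream with a zero-length payload

# Hypothetical sketch: skipping an empty file by recursing instead of
# looping overflows the stack once the number of consecutive empty files
# exceeds the recursion limit.
def get_next_row(files, i=0):
    if i == len(files):
        return None
    if files[i]:                       # non-empty file: return its first row
        return files[i][0]
    return get_next_row(files, i + 1)  # empty file: recurse into the next one

sys.setrecursionlimit(100)
try:
    get_next_row([[]] * 1000 + [["a row"]])
except RecursionError:
    print("stack exhausted, analogous to the StackOverflowError above")
```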
Re: [VOTE] hive release candidate 0.4.1-rc0
We use the zip codec by default. Some repeated frames were omitted from the error stack:

        at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:272)

Thanks,
Min

On Mon, Nov 2, 2009 at 2:57 PM, Zheng Shao zsh...@gmail.com wrote:
> Min, can you check the default compression codec in your hadoop conf?
> The 8-byte file must be a compressed file, in the codec in question, that represents a 0-length file. It seems that codec was not able to decompress the stream.
>
> Zheng

--
My research interests are distributed systems, parallel computing and bytecode-based virtual machines.
My profile: http://www.linkedin.com/in/coderplay
My blog: http://coderplay.javaeye.com
[jira] Commented: (HIVE-842) Authentication Infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12765362#action_12765362 ]

Min Zhou commented on HIVE-842:
-------------------------------

@Edward
Kerberos for authentication is a good approach, I think; user/password is not needed here. This issue could be implemented in the future. By the way, we've finished the development of the authorization infrastructure for Hive.

> Authentication Infrastructure for Hive
> --------------------------------------
>
>                 Key: HIVE-842
>                 URL: https://issues.apache.org/jira/browse/HIVE-842
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Server Infrastructure
>            Reporter: Edward Capriolo
>
> This issue deals with the authentication (user name, password) infrastructure, not the authorization components that specify what a user should be able to do.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Re: [VOTE] vote for release candidate for hive
I saw it. +1, all tests passed.

On Wed, Sep 30, 2009 at 1:59 AM, Namit Jain nj...@facebook.com wrote:
> I did find the files:
>
> [nj...@dev029 /tmp]$ ls -lrt hive-0.4.0-dev-hadoop-0.19.0/src
> total 33580
> drwxr-xr-x 4 njain users     4096 Aug 11 16:41 docs
> drwxr-xr-x 7 njain users     4096 Aug 11 16:41 data
> -rw-r--r-- 1 njain users    15675 Aug 11 16:41 README.txt
> -rw-r--r-- 1 njain users     2810 Sep  2 10:44 TestTruncate.launch
> -rw-r--r-- 1 njain users     2804 Sep  2 10:44 TestMTQueries.launch
> -rw-r--r-- 1 njain users     2807 Sep  2 10:44 TestJdbc.launch
> -rw-r--r-- 1 njain users     2808 Sep  2 10:44 TestHive.launch
> -rw-r--r-- 1 njain users     2805 Sep  2 10:44 TestCliDriver.launch
> -rw-r--r-- 1 njain users    17045 Sep 10 15:16 build.xml
> -rw-r--r-- 1 njain users      850 Sep 10 15:16 build.properties
> -rw-r--r-- 1 njain users    12520 Sep 10 15:16 build-common.xml
> -rw-r--r-- 1 njain users    33431 Sep 17 18:15 CHANGES.txt
> -rw-r--r-- 1 njain users     1071 Sep 18 13:26 runscr
> -rw-r--r-- 1 njain users 23392371 Sep 18 13:26 hive-0.4.0-hadoop-0.20.0-dev.tar.gz
> -rw-r--r-- 1 njain users 10735695 Sep 18 13:27 hive-0.4.0-hadoop-0.20.0-bin.tar.gz
> drwxr-xr-x 3 njain users     4096 Sep 29 10:54 jdbc
> drwxr-xr-x 2 njain users     4096 Sep 29 10:54 ivy
> drwxr-xr-x 4 njain users     4096 Sep 29 10:54 hwi
> drwxr-xr-x 4 njain users     4096 Sep 29 10:54 eclipse-templates
> drwxr-xr-x 3 njain users     4096 Sep 29 10:54 contrib
> drwxr-xr-x 2 njain users     4096 Sep 29 10:54 conf
> drwxr-xr-x 3 njain users     4096 Sep 29 10:54 common
> drwxr-xr-x 4 njain users     4096 Sep 29 10:54 cli
> drwxr-xr-x 3 njain users     4096 Sep 29 10:54 ant
> drwxr-xr-x 2 njain users     4096 Sep 29 10:54 testutils
> drwxr-xr-x 2 njain users     4096 Sep 29 10:54 testlibs
> drwxr-xr-x 3 njain users     4096 Sep 29 10:54 shims
> drwxr-xr-x 6 njain users     4096 Sep 29 10:54 service
> drwxr-xr-x 4 njain users     4096 Sep 29 10:54 serde
> drwxr-xr-x 5 njain users     4096 Sep 29 10:54 ql
> drwxr-xr-x 4 njain users     4096 Sep 29 10:54 odbc
> drwxr-xr-x 6 njain users     4096 Sep 29 10:54 metastore
> drwxr-xr-x 2 njain users     4096 Sep 29 10:54 lib
> drwxr-xr-x 3 njain users     4096 Sep 29 10:54 bin
>
> I have attached the output.
>
> -namit
we can not pass unit test of trunk.
Hi all,
below is a failure:

[junit] Begin query: input41.q
[junit] plan = /tmp/plan37765.xml
[junit] plan = /tmp/plan37766.xml
[junit] plan = /tmp/plan37767.xml
[junit] plan = /tmp/plan37768.xml
[junit] diff -a -I \(file:\)\|\(/tmp/.*\) -I lastUpdateTime -I lastAccessTime -I owner /home/hivetest/hive-trunk/build/ql/test/logs/clientpositive/input41.q.out /home/hivetest/hive-trunk/ql/src/test/results/clientpositive/input41.q.out
[junit] 7,8c7
[junit] < Output: file:/home/hivetest/hive-trunk/build/ql/tmp/1868499757/1
[junit] < 0
[junit] ---
[junit] > Output: file:/data/users/njain/hive1/hive1/build/ql/tmp/607183026/1
[junit] 9a9
[junit] > 0
[junit] Exception: Client execution results failed with error code = 1
[junit] junit.framework.AssertionFailedError: Client execution results failed with error code = 1
[junit]     at junit.framework.Assert.fail(Assert.java:47)
[junit]     at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input41(TestCliDriver.java:5010)
[junit]     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit]     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit]     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit]     at java.lang.reflect.Method.invoke(Method.java:597)
[junit]     at junit.framework.TestCase.runTest(TestCase.java:154)
[junit]     at junit.framework.TestCase.runBare(TestCase.java:127)
[junit]     at junit.framework.TestResult$1.protect(TestResult.java:106)
[junit]     at junit.framework.TestResult.runProtected(TestResult.java:124)
[junit]     at junit.framework.TestResult.run(TestResult.java:109)
[junit]     at junit.framework.TestCase.run(TestCase.java:118)
[junit]     at junit.framework.TestSuite.runTest(TestSuite.java:208)
[junit]     at junit.framework.TestSuite.run(TestSuite.java:203)
[junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
[junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
[junit]     at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)

$ cat test-0.19.0.log | grep Failures
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.297 sec
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.637 sec
[junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 0.777 sec
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.446 sec
[junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 0.323 sec
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.422 sec
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.308 sec
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.364 sec
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.354 sec
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.363 sec
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.389 sec
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.379 sec
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.321 sec
[junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 13.234 sec
[junit] Tests run: 338, Failures: 3, Errors: 0, Time elapsed: 3,955.436 sec
[junit] Tests run: 75, Failures: 0, Errors: 0, Time elapsed: 208.786 sec
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 45.511 sec
[junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 36.822 sec
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 14.556 sec
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 7.721 sec
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.117 sec
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 12.42 sec
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.038 sec
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.055 sec
[junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 1.869 sec
[junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 7.706 sec
[junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 9.301 sec
[junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.338 sec
[junit] Tests run: 44, Failures: 0, Errors: 0, Time elapsed: 155.071 sec
[junit] Tests run: 33, Failures: 0, Errors: 0, Time elapsed: 108.604 sec
[junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.588 sec

We ran the tests with: ant test -Dhadoop.version=0.19.0

Thanks,
Min

--
My research interests are distributed systems, parallel computing and bytecode-based virtual machines.
My profile: http://www.linkedin.com/in/coderplay
My blog: http://coderplay.javaeye.com
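For reference, the diff command in the log above is the golden-file comparison the test harness runs: lines matching the -I patterns (file: URIs, /tmp/ paths, timestamps, owner) are treated as machine-specific and masked, so only genuine output differences fail a test. A rough Python approximation of that masking, with made-up sample lines (diff's -I semantics are subtler — it ignores a hunk only when every changed line in it matches — but the idea is the same):

```python
import re

# patterns the harness passes to diff -I (machine-specific noise)
MASKS = [r"file:", r"/tmp/.*", r"lastUpdateTime", r"lastAccessTime", r"owner"]

def normalize(lines):
    # drop lines that only differ by machine-specific detail
    return [l for l in lines if not any(re.search(m, l) for m in MASKS)]

# the masked path difference alone would not fail the test ...
actual   = ["Output: file:/home/hivetest/build/ql/tmp/1868499757/1", "0"]
expected = ["Output: file:/data/users/njain/build/ql/tmp/607183026/1", "0"]
assert normalize(actual) == normalize(expected)

# ... but the extra, unmasked "0" row reported in the failure still differs
assert normalize(actual + ["0"]) != normalize(expected)
```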
Re: [VOTE] vote for release candidate for hive
Hi Namit,
I meant http://people.apache.org/~namit/hive-0.4.0-candidate-2/hive-0.4.0-hadoop-0.19.0-dev.tar.gz

Min

On Wed, Sep 23, 2009 at 5:31 AM, Namit Jain nj...@facebook.com wrote:
> Which one are you looking at? I downloaded just now from:
> http://people.apache.org/~namit/hive-0.4.0-candidate-2/hive-0.4.0-hadoop-0.20.0-dev.tar.gz
> and it contains CHANGES.txt, build.xml, etc. Did you download the binary tarball?
>
> Thanks,
> -namit

--
My research interests are distributed systems, parallel computing and bytecode-based virtual machines.
My profile: http://www.linkedin.com/in/coderplay
My blog: http://coderplay.javaeye.com
Re: we can not pass unit test of trunk.
We now use ant test -Dhadoop.version=0.19.0 -Doverwrite=true, and it passes. Can anyone give me an explanation?

Thanks,
Min
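A plausible reading of -Doverwrite=true (hedged — this is how golden-file harnesses typically behave, inferred rather than read from Hive's build files): in overwrite mode the actual query output is written over the checked-in .q.out file instead of being diffed against it, so the run trivially passes and the new output becomes the baseline. A minimal sketch with a hypothetical check() helper:

```python
import tempfile
from pathlib import Path

def check(actual: str, expected_file: Path, overwrite: bool = False) -> bool:
    # overwrite mode: accept the actual output as the new golden file
    if overwrite:
        expected_file.write_text(actual)
        return True
    return actual == expected_file.read_text()

with tempfile.TemporaryDirectory() as d:
    golden = Path(d) / "input41.q.out"
    golden.write_text("old expected\n")
    assert not check("new output\n", golden)               # normal run fails
    assert check("new output\n", golden, overwrite=True)   # regenerate baseline
    assert check("new output\n", golden)                   # subsequent runs pass
```

The catch, of course, is that an overwritten baseline also silently accepts any real regression, which is why the diff failed in the first place.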
Re: [VOTE] vote for release candidate for hive
Hi Namit,
I haven't found build.xml or CHANGES.txt in your tarball. They must be included so that we can test it and check the changes, I think.

Thanks,
Min

On Sat, Sep 19, 2009 at 4:42 AM, Namit Jain nj...@facebook.com wrote:
> It is available from http://people.apache.org/~namit/
>
> Thanks,
> -namit
>
> -----Original Message-----
> From: Ashish Thusoo
> Sent: Thursday, September 17, 2009 11:55 PM
> To: hive-dev@hadoop.apache.org; Namit Jain
> Subject: RE: [VOTE] vote for release candidate for hive
>
> Namit,
> Can you make it available from http://people.apache.org/~njain/ ? That way, people who do not have access to the apache machines will also be able to try the candidate.
>
> Thanks,
> Ashish
>
> ________________________________________
> From: Namit Jain [nj...@facebook.com]
> Sent: Thursday, September 17, 2009 6:32 PM
> To: Namit Jain; hive-dev@hadoop.apache.org
> Subject: [VOTE] vote for release candidate for hive
>
> Following the convention
>
> -----Original Message-----
> From: Namit Jain
> Sent: Thursday, September 17, 2009 6:31 PM
> To: hive-dev@hadoop.apache.org
> Subject: vote for release candidate for hive
>
> I have created another release candidate for Hive.
> https://svn.apache.org/repos/asf/hadoop/hive/tags/release-0.4.0-rc2/
>
> Let me know if it is OK to publish this release candidate. The only change from the previous candidate (https://svn.apache.org/repos/asf/hadoop/hive/tags/release-0.4.0-rc1/) is the fix for https://issues.apache.org/jira/browse/HIVE-838
>
> The tarball can be found at:
> people.apache.org /home/namit/public_html/hive-0.4.0-candidate-2/hive-0.4.0-dev.tar.gz
>
> Thanks,
> -namit

--
My research interests are distributed systems, parallel computing and bytecode-based virtual machines.
My profile: http://www.linkedin.com/in/coderplay
My blog: http://coderplay.javaeye.com
[jira] Commented: (HIVE-78) Authorization infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758112#action_12758112 ]

Min Zhou commented on HIVE-78:
------------------------------

@Namit
Got your meaning. We are maintaining a version of our own; it will take a couple of weeks to adapt it to the trunk.

> Authorization infrastructure for Hive
> -------------------------------------
>
>                 Key: HIVE-78
>                 URL: https://issues.apache.org/jira/browse/HIVE-78
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Server Infrastructure
>            Reporter: Ashish Thusoo
>            Assignee: Edward Capriolo
>         Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff
>
> Allow hive to integrate with existing user repositories for authentication and authorization information.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authorization infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757616#action_12757616 ]

Min Zhou commented on HIVE-78:
------------------------------

sorry,
{noformat}
public class GenericAuthenticator extends Authenticator {
  public GenericAuthenticator(Hive db, User user);
  ...
}
{noformat}

> Authorization infrastructure for Hive
> -------------------------------------
>
>                 Key: HIVE-78
>                 URL: https://issues.apache.org/jira/browse/HIVE-78
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Server Infrastructure
>            Reporter: Ashish Thusoo
>            Assignee: Edward Capriolo
>         Attachments: hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff
>
> Allow hive to integrate with existing user repositories for authentication and authorization information.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authorization infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12757622#action_12757622 ]

Min Zhou commented on HIVE-78:
------------------------------

oops, my code wasn't on this machine. I just pasted yours and modified it into mine. Here is a patch showing my code for that.

> Authorization infrastructure for Hive
> -------------------------------------
>
>                 Key: HIVE-78
>                 URL: https://issues.apache.org/jira/browse/HIVE-78
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Server Infrastructure
>            Reporter: Ashish Thusoo
>            Assignee: Edward Capriolo
>         Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff
>
> Allow hive to integrate with existing user repositories for authentication and authorization information.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-78) Authorization infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Min Zhou updated HIVE-78:
-------------------------

    Attachment: createuser-v1.patch

> Authorization infrastructure for Hive
> -------------------------------------
>
>                 Key: HIVE-78
>                 URL: https://issues.apache.org/jira/browse/HIVE-78
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Server Infrastructure
>            Reporter: Ashish Thusoo
>            Assignee: Edward Capriolo
>         Attachments: createuser-v1.patch, hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff
>
> Allow hive to integrate with existing user repositories for authentication and authorization information.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756904#action_12756904 ]

Min Zhou commented on HIVE-78:
------------------------------

Let me guess: you are all talking about the CLI. But we are using HiveServer as a multi-user server, like mysqld, not something that supports only one user.

> Authentication infrastructure for Hive
> --------------------------------------
>
>                 Key: HIVE-78
>                 URL: https://issues.apache.org/jira/browse/HIVE-78
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Server Infrastructure
>            Reporter: Ashish Thusoo
>            Assignee: Edward Capriolo
>         Attachments: hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff
>
> Allow hive to integrate with existing user repositories for authentication and authorization information.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756949#action_12756949 ]

Min Zhou commented on HIVE-78:
------------------------------

I do not think the HiveServer you have in mind is the same as mine, which supports multiple users, not just one.

> Authentication infrastructure for Hive
> --------------------------------------
>
>                 Key: HIVE-78
>                 URL: https://issues.apache.org/jira/browse/HIVE-78
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Server Infrastructure
>            Reporter: Ashish Thusoo
>            Assignee: Edward Capriolo
>         Attachments: hive-78-metadata-v1.patch, hive-78-syntax-v1.patch, hive-78.diff
>
> Allow hive to integrate with existing user repositories for authentication and authorization information.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12756951#action_12756951 ] Min Zhou commented on HIVE-78: -- From the words you commented: {noformat} Daemons like HiveService and HiveWebInterface will have to run as supergroup or a hive group? {noformat}
[jira] Updated: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-78: - Attachment: hive-78-metadata-v1.patch
[jira] Commented: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12756335#action_12756335 ] Min Zhou commented on HIVE-78: -- @Edward Sorry for my abuse of some words; I hope it will not affect our work. Can you point me to the JIRAs where it was decided that username/password information will be stored in Hadoop rather than in Hive? I think most companies are using Hadoop versions from 0.17 to 0.20, which do not have good password security. Once a company adopts a particular version, upgrading is a very important issue, and many companies will stick with a more stable version. Moreover, Hadoop still does not have that feature, which may take a very long time to implement. Why should we wait for it rather than implement it ourselves? I think Hive needs to support user/password at least for current versions of Hadoop. Many companies using Hive have reported that it is inconvenient for multi-user deployments, with respect to environment isolation, table sharing, security, etc. We must try to meet the requirements of most of them. Regarding the syntax, I guess we can do it in two steps: # support GRANT/REVOKE of privileges to users; # support some sort of server administration privileges, as Ashish mentioned. The GRANT statement enables system administrators to create Hive user accounts and to grant rights to accounts. To use GRANT, you must have the GRANT OPTION privilege, and you must have the privileges that you are granting. The REVOKE statement is related and enables administrators to remove account privileges. File hive-78-syntax-v1.patch modifies the syntax. Any comments on that?
[jira] Commented: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12755876#action_12755876 ] Min Zhou commented on HIVE-78: -- We will take over this issue; it should be finished in two weeks. Here are the SQL statements that will be added: {noformat} CREATE USER; DROP USER; ALTER USER SET PASSWORD; GRANT; REVOKE {noformat} Metadata is stored in some sort of persistent medium, such as a MySQL DBMS, through JDO. We will add three tables for this issue: USER, DBS_PRIV, and TABLES_PRIV. Privileges can be granted at several levels, and each table above corresponds to a privilege level. # Global level: global privileges apply to all databases on a given server. These privileges are stored in the USER table. GRANT ALL ON *.* and REVOKE ALL ON *.* grant and revoke only global privileges. GRANT ALL ON *.* TO 'someuser'; GRANT SELECT, INSERT ON *.* TO 'someuser'; # Database level: database privileges apply to all objects in a given database. These privileges are stored in the DBS_PRIV table. GRANT ALL ON db_name.* and REVOKE ALL ON db_name.* grant and revoke only database privileges. GRANT ALL ON mydb.* TO 'someuser'; GRANT SELECT, INSERT ON mydb.* TO 'someuser'; Although we can't create databases currently, this reserves a place until Hive supports them. # Table level: table privileges apply to all columns in a given table. These privileges are stored in the TABLES_PRIV table. GRANT ALL ON db_name.tbl_name and REVOKE ALL ON db_name.tbl_name grant and revoke only table privileges. GRANT ALL ON mydb.mytbl TO 'someuser'; GRANT SELECT, INSERT ON mydb.mytbl TO 'someuser'; Hive account information is stored in the USER table, including username, password, and global privileges. A user who has been granted any privilege on a particular table, such as select/insert/drop, always has the right to show that table.
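For concreteness, the three proposed privilege tables might look roughly like the sketch below. This is only an illustration of the global/database/table levels described above; the column names (user_name, db_name, tbl_name, the *_priv flags) are assumptions, not taken from the attached patches.

```sql
-- Hypothetical layout of the proposed metastore tables; columns are
-- illustrative only, not the actual schema from hive-78-metadata-v1.patch.
CREATE TABLE USER (            -- global level: applies to all databases
  user_name   VARCHAR(128) PRIMARY KEY,
  password    VARCHAR(128),
  select_priv BOOLEAN,
  insert_priv BOOLEAN,
  drop_priv   BOOLEAN
);

CREATE TABLE DBS_PRIV (        -- database level: applies to one database
  user_name   VARCHAR(128),
  db_name     VARCHAR(128),
  select_priv BOOLEAN,
  insert_priv BOOLEAN,
  PRIMARY KEY (user_name, db_name)
);

CREATE TABLE TABLES_PRIV (     -- table level: applies to one table
  user_name   VARCHAR(128),
  db_name     VARCHAR(128),
  tbl_name    VARCHAR(128),
  select_priv BOOLEAN,
  insert_priv BOOLEAN,
  PRIMARY KEY (user_name, db_name, tbl_name)
);
```

A privilege check would then consult USER first and fall back to DBS_PRIV and TABLES_PRIV, mirroring the global, database, and table levels in that order.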
[jira] Updated: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-78: - Attachment: hive-78-syntax-v1.patch
[jira] Commented: (HIVE-78) Authentication infrastructure for Hive
[ https://issues.apache.org/jira/browse/HIVE-78?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12755882#action_12755882 ] Min Zhou commented on HIVE-78: -- We currently use separate MySQL databases to achieve an isolated CLI environment, which is not practical. An authentication infrastructure is urgently needed for us. Almost all statements would be affected, for example: SELECT, INSERT, SHOW TABLES, SHOW PARTITIONS, DESCRIBE TABLE, MSCK, CREATE TABLE, CREATE FUNCTION (we are considering how to control who can create UDFs), DROP TABLE, DROP FUNCTION, and LOAD; plus GRANT/REVOKE themselves, and CREATE USER/DROP USER/SET PASSWORD. It even includes some non-SQL commands like set, add file, and add jar.
[jira] Commented: (HIVE-818) Create a Hive CLI that connects to hive ThriftServer
[ https://issues.apache.org/jira/browse/HIVE-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12752852#action_12752852 ] Min Zhou commented on HIVE-818: --- This feature looks pretty good to us; we were looking for a CLI-mode client for the Hive server. Create a Hive CLI that connects to hive ThriftServer Key: HIVE-818 URL: https://issues.apache.org/jira/browse/HIVE-818 Project: Hadoop Hive Issue Type: New Feature Components: Clients, Server Infrastructure Reporter: Edward Capriolo Assignee: Edward Capriolo We should have an alternate CLI that works by interacting with the HiveServer; this way it will be ready when/if we deprecate the current CLI.
[jira] Created: (HIVE-814) Exception when altering an int-typed column to date/datetime/timestamp
Exception when altering an int-typed column to date/datetime/timestamp - Key: HIVE-814 URL: https://issues.apache.org/jira/browse/HIVE-814 Project: Hadoop Hive Issue Type: Bug Reporter: Min Zhou As far as I know, time types can only be used in partitions; normal columns are not allowed to have those types. However, it turns out that a non-time-typed column can be altered to date/datetime/timestamp, and exceptions are then thrown when describing the table. hive> create table pokes(foo int, bar string); OK Time taken: 0.894 seconds hive> alter table pokes replace columns(foo date, bar string); OK Time taken: 0.266 seconds hive> describe pokes; FAILED: Error in metadata: MetaException(message:java.lang.IllegalArgumentException Error: type expected at the position 0 of 'date:string' but 'date' is found.) FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
Re: [VOTE] Branching and releasing for 0.4.0
+1 all tests pass on my machine. Min On Sat, Aug 8, 2009 at 5:57 PM, Amr Awadallah a...@cloudera.com wrote: +1 Namit Jain wrote: +1 On 8/6/09 5:16 PM, Zheng Shao zsh...@gmail.com wrote: +1 On Thu, Aug 6, 2009 at 5:10 PM, Ashish Thusooathu...@facebook.com wrote: Hi Folks, The following is a proposed schedule for 0.4.0 branching. Please vote on it by tomorrow and say whether you are ok with it or not. 8/7/2009 - Branch out 0.4.0 (branching to happen by mid night) 8/14/2009 - Code freeze on 0.4.0 8/28/2009 - Release 0.4.0 Thanks, Ashish -- Yours, Zheng -- My research interests are distributed systems, parallel computing and bytecode based virtual machine. My profile: http://www.linkedin.com/in/coderplay My blog: http://coderplay.javaeye.com
[jira] Commented: (HIVE-607) Create statistical UDFs.
[ https://issues.apache.org/jira/browse/HIVE-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736473#action_12736473 ] Min Zhou commented on HIVE-607: --- @Namit I implemented group_cat() in a rush, and found something difficult slove: 1. function group_cat() has a internal order by clause, currently, we can't such aggregation in hive. 2. when the string will be group concated is too large, in another is appears data skew, there is ofen not enough memory to store such a big string. Create statistical UDFs. Key: HIVE-607 URL: https://issues.apache.org/jira/browse/HIVE-607 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: S. Alex Smith Assignee: Emil Ibrishimov Priority: Minor Fix For: 0.4.0 Attachments: HIVE-607.1.patch, UDAFStddev.java Create UDFs replicating: STD() Return the population standard deviation STDDEV_POP()(v5.0.3) Return the population standard deviation STDDEV_SAMP()(v5.0.3) Return the sample standard deviation STDDEV() Return the population standard deviation SUM() Return the sum VAR_POP()(v5.0.3) Return the population standard variance VAR_SAMP()(v5.0.3) Return the sample variance VARIANCE()(v4.1) Return the population standard variance as found at http://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html.
[jira] Commented: (HIVE-607) Create statistical UDFs.
[ https://issues.apache.org/jira/browse/HIVE-607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736475#action_12736475 ] Min Zhou commented on HIVE-607: --- sorry, some typos @Namit I've implemented group_cat() in a rush, and found some things difficult to solve: 1. group_cat() has an internal order-by clause; currently, we can't implement such an aggregation in Hive. 2. when the strings to be group-concatenated are too large -- in other words, when data skew appears -- there is often not enough memory to store such a big result.
[jira] Updated: (HIVE-702) DROP TEMPORARY FUNCTION should not drop builtin functions
[ https://issues.apache.org/jira/browse/HIVE-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-702: -- Attachment: HIVE-702.1.patch patch DROP TEMPORARY FUNCTION should not drop builtin functions - Key: HIVE-702 URL: https://issues.apache.org/jira/browse/HIVE-702 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Min Zhou Attachments: HIVE-702.1.patch Only temporary functions should be dropped. It should error out if the user tries to drop built-in functions.
[jira] Commented: (HIVE-702) DROP TEMPORARY FUNCTION should not drop builtin functions
[ https://issues.apache.org/jira/browse/HIVE-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736762#action_12736762 ] Min Zhou commented on HIVE-702: --- Please wait a moment; I haven't dealt with the conflict you mentioned yet.
[jira] Updated: (HIVE-702) DROP TEMPORARY FUNCTION should not drop builtin functions
[ https://issues.apache.org/jira/browse/HIVE-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-702: -- Attachment: HIVE-702.2.patch done
[jira] Commented: (HIVE-702) DROP TEMPORARY FUNCTION should not drop builtin functions
[ https://issues.apache.org/jira/browse/HIVE-702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736767#action_12736767 ] Min Zhou commented on HIVE-702: --- That patch hasn't been tested, because I'm at home and cannot connect to the company's VPN.
[jira] Assigned: (HIVE-700) Fix test error by adding DROP FUNCTION
[ https://issues.apache.org/jira/browse/HIVE-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou reassigned HIVE-700: - Assignee: Min Zhou Fix test error by adding DROP FUNCTION Key: HIVE-700 URL: https://issues.apache.org/jira/browse/HIVE-700 Project: Hadoop Hive Issue Type: Bug Reporter: Zheng Shao Assignee: Min Zhou Since we added Show Functions in HIVE-580, test results will depend on what temporary functions are added to the system. We should add the capability of DROP FUNCTION, and do that at the end of those create function tests to make sure the show functions results are deterministic.
[jira] Updated: (HIVE-649) [UDF] now() for getting current time
[ https://issues.apache.org/jira/browse/HIVE-649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-649: -- Attachment: HIVE-649.patch patch [UDF] now() for getting current time Key: HIVE-649 URL: https://issues.apache.org/jira/browse/HIVE-649 Project: Hadoop Hive Issue Type: New Feature Reporter: Min Zhou Attachments: HIVE-649.patch http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_now
[jira] Updated: (HIVE-700) Fix test error by adding DROP FUNCTION
[ https://issues.apache.org/jira/browse/HIVE-700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-700: -- Attachment: HIVE-700.1.patch usage: drop function function_name
[jira] Commented: (HIVE-700) Fix test error by adding DROP FUNCTION
[ https://issues.apache.org/jira/browse/HIVE-700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736435#action_12736435 ] Min Zhou commented on HIVE-700: --- Sorry for the delay; we had a training session today. I will update a new patch for the HIVE-700-related JIRAs.
[jira] Assigned: (HIVE-702) DROP TEMPORARY FUNCTION should not drop builtin functions
[ https://issues.apache.org/jira/browse/HIVE-702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou reassigned HIVE-702: - Assignee: Min Zhou
[jira] Commented: (HIVE-642) udf equivalent to string split
[ https://issues.apache.org/jira/browse/HIVE-642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12733641#action_12733641 ] Min Zhou commented on HIVE-642: --- It's very useful for us. Some comments: # Can you implement it directly with Text? Avoiding string decoding and encoding would be faster. Of course, that trick may lead to another problem, as String.split uses a regular expression for splitting. # getDisplayString() always returns a string in lowercase. udf equivalent to string split -- Key: HIVE-642 URL: https://issues.apache.org/jira/browse/HIVE-642 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Reporter: Namit Jain Assignee: Emil Ibrishimov Fix For: 0.4.0 Attachments: HIVE-642.1.patch, HIVE-642.2.patch It would be very useful to have a function equivalent to string split in java
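The Text-based suggestion above amounts to splitting raw UTF-8 bytes instead of first decoding them to a String. A minimal sketch of the idea, using plain byte arrays rather than Hadoop's Text class (the class and method names are hypothetical, and this only handles a single-byte ASCII delimiter, not a regular expression):

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustration of the byte-level split suggested above: for a single-byte
// (ASCII) delimiter, a UTF-8 buffer can be split without decoding it to a
// String, because an ASCII byte never occurs inside a multi-byte UTF-8
// sequence.
public class ByteSplit {
    public static List<byte[]> split(byte[] utf8, byte delim) {
        List<byte[]> parts = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= utf8.length; i++) {
            // A part ends at each delimiter and at the end of the buffer.
            if (i == utf8.length || utf8[i] == delim) {
                byte[] part = new byte[i - start];
                System.arraycopy(utf8, start, part, 0, i - start);
                parts.add(part);
                start = i + 1;
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        byte[] buf = "a,b,c".getBytes(StandardCharsets.UTF_8);
        for (byte[] p : split(buf, (byte) ',')) {
            System.out.println(new String(p, StandardCharsets.UTF_8));
        }
    }
}
```

The same loop would work directly against Text.getBytes() and Text.getLength(), which is where the decode/encode savings come from.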
[jira] Commented: (HIVE-599) Embedded Hive SQL into Python
[ https://issues.apache.org/jira/browse/HIVE-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12733429#action_12733429 ] Min Zhou commented on HIVE-599: --- I agree with Namit and Yongqiang. I was thinking about creating functions with a format like the one below: {noformat} create function function_name (argument list) as python { python udf code } create function function_name (argument list) as java { java udf code } {noformat} We can dynamically compile those kinds of code using Jython and com.sun.tools.javac respectively. It would be better to store the Python or Java UDF bytecode in the persistent metastore (typically MySQL) after creation; then we can call that function again without a second function creation. Embedded Hive SQL into Python - Key: HIVE-599 URL: https://issues.apache.org/jira/browse/HIVE-599 Project: Hadoop Hive Issue Type: New Feature Reporter: Ashish Thusoo Assignee: Ashish Thusoo While Hive does SQL it would be very powerful to be able to embed that SQL in languages like python in such a way that the hive query is also able to invoke python functions seemlessly. One possibility is to explore integration with Dumbo. Another is to see if the internal map_reduce.py tool can be open sourced as a Hive contrib. Other thoughts?
[jira] Commented: (HIVE-512) [GenericUDF] new string function ELT(N,str1,str2,str3,...)
[ https://issues.apache.org/jira/browse/HIVE-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12733015#action_12733015 ] Min Zhou commented on HIVE-512: --- Can you answer my question about these queries? [GenericUDF] new string function ELT(N,str1,str2,str3,...) --- Key: HIVE-512 URL: https://issues.apache.org/jira/browse/HIVE-512 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-512.2.patch, HIVE-512.patch ELT(N,str1,str2,str3,...) Returns str1 if N = 1, str2 if N = 2, and so on. Returns NULL if N is less than 1 or greater than the number of arguments. ELT() is the complement of FIELD(). {noformat} mysql> SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo'); -> 'ej' mysql> SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo'); -> 'foo' {noformat}
[jira] Commented: (HIVE-512) [GenericUDF] new string function ELT(N,str1,str2,str3,...)
[ https://issues.apache.org/jira/browse/HIVE-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12733016#action_12733016 ] Min Zhou commented on HIVE-512: --- select(1, '2', 3) select(2, '2', 3) select(1, true, 3) select(2, 2.0, cast(3 as double)) If we don't uniformly return strings, it would be confusing for users to determine which type will be returned.
[jira] Commented: (HIVE-512) [GenericUDF] new string function ELT(N,str1,str2,str3,...)
[ https://issues.apache.org/jira/browse/HIVE-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12733103#action_12733103 ] Min Zhou commented on HIVE-512: --- If you inspect the implementation of CASE, you will see that it does not accept arguments of different types. See GenericUDFCase.java and GenericUDFWhen.java: {code} hive> select case when true then '2' else 3 end from pokes limit 1; FAILED: Error in semantic analysis: line 1:36 Argument Type Mismatch 3: The expression after ELSE should have the same type as those after THEN: string is expected but int is found {code} elt is a string function; confusion will be caused if we casually change its behavior. There is no need to make things more complex.
It's strange: two if clauses with the same logic
Hi, see ObjectInspectorConverters.java:78

case STRING:
  if (outputOI instanceof WritableStringObjectInspector) {
    return new PrimitiveObjectInspectorConverter.TextConverter(
        (PrimitiveObjectInspector) inputOI);
  } else if (outputOI instanceof WritableStringObjectInspector) {
    return new PrimitiveObjectInspectorConverter.TextConverter(
        (PrimitiveObjectInspector) inputOI);
  }

The second clause has the same logic as the first one, so it can never be reached. I guess something is wrong; maybe one branch was meant for WritableStringObjectInspector, and the other for JavaStringObjectInspector. Min -- My research interests are distributed systems, parallel computing and bytecode based virtual machine. My profile: http://www.linkedin.com/in/coderplay My blog: http://coderplay.javaeye.com
[jira] Commented: (HIVE-512) [GenericUDF] new string function ELT(N,str1,str2,str3,...)
[ https://issues.apache.org/jira/browse/HIVE-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12732911#action_12732911 ] Min Zhou commented on HIVE-512: --- Here is the definition of elt: return the string at the index number. It's essentially a string function. select elt(1, 2, 3) will return a varbinary in mysql, rather than an int. I still insist that returning string is better. Even if we do it as you said, what type of result will be returned for queries like the ones below? select(1, '2', 3) select(2, '2', 3) select(1, true, 3) select(2, 2.0, cast(3 as double))
[jira] Commented: (HIVE-512) [GenericUDF] new string function ELT(N,str1,str2,str3,...)
[ https://issues.apache.org/jira/browse/HIVE-512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732828#action_12732828 ] Min Zhou commented on HIVE-512: --- Actually, ELT returns only two types of results in MySQL: varbinary and varchar. varchar is returned if all arguments are varchars; otherwise varbinary is returned.

mysql> create table t3 as select elt(1, 'a', 3);
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> describe t3;
+----------------+--------------+------+-----+---------+-------+
| Field          | Type         | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+-------+
| elt(1, 'a', 3) | varbinary(1) | YES  |     | NULL    |       |
+----------------+--------------+------+-----+---------+-------+
1 row in set (0.00 sec)

mysql> create table t4 as select elt(1, true, false);
Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> describe t4;
+---------------------+--------------+------+-----+---------+-------+
| Field               | Type         | Null | Key | Default | Extra |
+---------------------+--------------+------+-----+---------+-------+
| elt(1, true, false) | varbinary(1) | YES  |     | NULL    |       |
+---------------------+--------------+------+-----+---------+-------+
1 row in set (0.00 sec)

mysql> create table t5 as select elt(1, 2.0, false);
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> describe t5;
+--------------------+--------------+------+-----+---------+-------+
| Field              | Type         | Null | Key | Default | Extra |
+--------------------+--------------+------+-----+---------+-------+
| elt(1, 2.0, false) | varbinary(4) | YES  |     | NULL    |       |
+--------------------+--------------+------+-----+---------+-------+
1 row in set (0.00 sec)

Based on the above, I think it is better to return string, as binary is not commonly used in hive. [GenericUDF] new string function ELT(N,str1,str2,str3,...) --- Key: HIVE-512 URL: https://issues.apache.org/jira/browse/HIVE-512 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-512.2.patch, HIVE-512.patch ELT(N,str1,str2,str3,...) Returns str1 if N = 1, str2 if N = 2, and so on. Returns NULL if N is less than 1 or greater than the number of arguments. ELT() is the complement of FIELD().
{noformat}
mysql> SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo');
        -> 'ej'
mysql> SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo');
        -> 'foo'
{noformat}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
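The ELT semantics quoted above can be sketched in plain Java — a simplified model of the behavior only, not Hive's actual GenericUDF code (which evaluates through ObjectInspectors); the class name here is illustrative:

```java
public class Elt {
    /**
     * Returns the n-th string argument (1-based), mirroring MySQL's
     * ELT(N, str1, str2, ...). Returns null when N is less than 1 or
     * greater than the number of arguments, as the issue describes.
     */
    public static String elt(int n, String... args) {
        if (n < 1 || n > args.length) {
            return null; // MySQL returns NULL for out-of-range N
        }
        return args[n - 1];
    }
}
```

Running this against the examples from the issue gives 'ej' for N = 1 and 'foo' for N = 4.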
[jira] Commented: (HIVE-541) Implement UDFs: INSTR and LOCATE
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12731858#action_12731858 ] Min Zhou commented on HIVE-541: --- all test cases passed on my side, how's yours? Implement UDFs: INSTR and LOCATE Key: HIVE-541 URL: https://issues.apache.org/jira/browse/HIVE-541 Project: Hadoop Hive Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Attachments: HIVE-541.1.patch, HIVE-541.2.patch http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_instr http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_locate These functions can be directly implemented with Text (instead of String). This will make the test of whether one string contains another string much faster. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
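The Text-based idea in the issue — testing whether one string contains another without decoding bytes into java.lang.String — amounts to a byte-level search. A minimal sketch, using plain byte arrays as a stand-in for org.apache.hadoop.io.Text (the method name findBytes is illustrative, not the actual Hive API):

```java
public class ByteSearch {
    /**
     * Returns the index of the first occurrence of needle in haystack
     * at or after start, or -1 if absent. Works directly on the raw
     * UTF-8 bytes, so no String encoding/decoding is needed.
     */
    public static int findBytes(byte[] haystack, byte[] needle, int start) {
        if (needle.length == 0) {
            return start <= haystack.length ? start : -1;
        }
        for (int i = start; i <= haystack.length - needle.length; i++) {
            int j = 0;
            while (j < needle.length && haystack[i + j] == needle[j]) {
                j++;
            }
            if (j == needle.length) return i; // full match at position i
        }
        return -1;
    }
}
```

Because UTF-8 is self-synchronizing, a byte-wise match of a valid UTF-8 needle is also a character-wise match, which is what makes this approach safe as well as fast.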
[jira] Resolved: (HIVE-515) [UDF] new string function INSTR(str,substr)
[ https://issues.apache.org/jira/browse/HIVE-515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou resolved HIVE-515. --- Resolution: Duplicate duplicates [#HIVE-541] [UDF] new string function INSTR(str,substr) --- Key: HIVE-515 URL: https://issues.apache.org/jira/browse/HIVE-515 Project: Hadoop Hive Issue Type: New Feature Reporter: Min Zhou Assignee: Min Zhou Attachments: HIVE-515-2.patch, HIVE-515.patch UDF for string function INSTR(str,substr). This extends the function from MySQL http://dev.mysql.com/doc/refman/5.1/en/string-functions.html#function_instr usage: INSTR(str, substr) INSTR(str, substr, start) example:
{code:sql}
select instr('abcd', 'abc') from pokes;      -- all results are '1'
select instr('abcabc', 'ccc') from pokes;    -- all results are '0'
select instr('abcabc', 'abc', 2) from pokes; -- all results are '4'
{code}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
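The 1-based INSTR semantics shown in the examples can be modeled in plain Java. This is a sketch of the behavior only, not the patch itself (the real UDF works on Hive's Text objects):

```java
public class Instr {
    /** INSTR(str, substr): 1-based position of substr in str, 0 if absent. */
    public static int instr(String str, String substr) {
        return instr(str, substr, 1);
    }

    /** INSTR(str, substr, start): the search begins at 1-based position start. */
    public static int instr(String str, String substr, int start) {
        if (start < 1) {
            return 0; // positions are 1-based; invalid start finds nothing
        }
        int idx = str.indexOf(substr, start - 1);
        return idx < 0 ? 0 : idx + 1; // convert 0-based index back to 1-based
    }
}
```

This reproduces the three example queries: instr('abcd','abc') is 1, instr('abcabc','ccc') is 0, and instr('abcabc','abc',2) is 4 because the search skips the match at position 1.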
[jira] Created: (HIVE-649) [UDF] now() for getting current time
[UDF] now() for getting current time Key: HIVE-649 URL: https://issues.apache.org/jira/browse/HIVE-649 Project: Hadoop Hive Issue Type: New Feature Reporter: Min Zhou http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_now -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
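A minimal sketch of what such a now() function could evaluate to, formatted like MySQL's 'YYYY-MM-DD hh:mm:ss' output. The class name and the choice of format are assumptions — the issue itself only links to the MySQL documentation:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class Now {
    // MySQL NOW() renders as e.g. 2009-07-20 14:03:21
    private static final DateTimeFormatter FMT =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    /** Current local time as a string, MySQL NOW()-style. */
    public static String now() {
        return LocalDateTime.now().format(FMT);
    }
}
```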
[jira] Commented: (HIVE-541) Implement UDFs: INSTR and LOCATE
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731764#action_12731764 ] Min Zhou commented on HIVE-541: --- Hmm, it may be a good way. I will try it soon. Implement UDFs: INSTR and LOCATE Key: HIVE-541 URL: https://issues.apache.org/jira/browse/HIVE-541 Project: Hadoop Hive Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Attachments: HIVE-541.1.patch http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_instr http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_locate These functions can be directly implemented with Text (instead of String). This will make the test of whether one string contains another string much faster. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-541) Implement UDFs: INSTR and LOCATE
[ https://issues.apache.org/jira/browse/HIVE-541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-541: -- Attachment: HIVE-541.2.patch Added a GenericUDFUtils.findText() that avoids string encoding and decoding, so faster execution is gained. Implement UDFs: INSTR and LOCATE Key: HIVE-541 URL: https://issues.apache.org/jira/browse/HIVE-541 Project: Hadoop Hive Issue Type: New Feature Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Attachments: HIVE-541.1.patch, HIVE-541.2.patch http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_instr http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_locate These functions can be directly implemented with Text (instead of String). This will make the test of whether one string contains another string much faster. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-329) start and stop hive thrift server in daemon mode
[ https://issues.apache.org/jira/browse/HIVE-329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou reassigned HIVE-329: - Assignee: Min Zhou start and stop hive thrift server in daemon mode - Key: HIVE-329 URL: https://issues.apache.org/jira/browse/HIVE-329 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Affects Versions: 0.3.0 Reporter: Min Zhou Assignee: Min Zhou Attachments: daemon.patch I wrote two shell scripts to start and stop the hive thrift server more conveniently. usage:
bin/hive --service start-hive [HIVE_PORT]
bin/hive --service stop-hive
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-555) create temporary function support not only udf, but also udaf, genericudf, etc.
[ https://issues.apache.org/jira/browse/HIVE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-555: -- Attachment: HIVE-555-4.patch Added a copy of UDAF to avoid [HIVE-620|http://issues.apache.org/jira/browse/HIVE-620] so that all test cases pass. create temporary function support not only udf, but also udaf, genericudf, etc. Key: HIVE-555 URL: https://issues.apache.org/jira/browse/HIVE-555 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-555-1.patch, HIVE-555-2.patch, HIVE-555-3.patch, HIVE-555-4.patch Right now, the command 'create temporary function' only supports udf. We can also let users write their udaf and generic udf, and write generic udaf in the future. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
Some wish after serious consideration
Hi all, Having focused on hive for several months, here are some wishes of mine after serious consideration.

1. All auto-generated code for hive was produced under the facebook commercial version of thrift, which is older than the open source one. This leads to lots of compatibility problems and cuts us off from help from the open source community. We need to remove that code as soon as possible, but it seems the progress on this issue has stopped.

2. Please give us a clear roadmap. We also have a plan for improving hive, but our patches would probably go uncared-for, because they are not on facebook's schedule. If things go on like this, there will be a lot of compatibility problems brought in by other commits, and we will be suffering from fixing conflicts again and again.

3. Please don't commit code so rashly. Code from Ashish could easily be committed by others without strict examination, which caused a lot of problems when we used it here: bugs, and incondite code that is hard to read and to extend. Perhaps the main reason is that Ashish is the leader of Hive. Another person, Namit, has often committed buggy or ugly code. I have a suggestion: more discussion and tests with the help of the open source community. Code quality would be raised that way. (I don't intend any personal attacks here.)

Regards, Min -- My research interests are distributed systems, parallel computing and bytecode based virtual machine. My profile: http://www.linkedin.com/in/coderplay My blog: http://coderplay.javaeye.com
[jira] Created: (HIVE-618) More human-readable error prompt of FunctionTask
More human-readable error prompt of FunctionTask Key: HIVE-618 URL: https://issues.apache.org/jira/browse/HIVE-618 Project: Hadoop Hive Issue Type: Improvement Reporter: Min Zhou current prompt:
{noformat}
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask
{noformat}
Zheng suggested that something like the below would be better:
{noformat}
Class not found
Class does not implement UDF, GenericUDF, or UDAF
{noformat}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
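The suggested messages could come from a validation step along these lines. This is a sketch only: the nested marker interfaces are stand-ins for Hive's real UDF/GenericUDF/UDAF base classes, and the method name is hypothetical:

```java
public class FunctionCheck {
    // Stand-ins for Hive's actual function base classes.
    interface UDF {}
    interface GenericUDF {}
    interface UDAF {}

    /**
     * Returns null when the class is usable as a function,
     * otherwise a human-readable error message as suggested
     * in HIVE-618.
     */
    public static String validate(String className) {
        Class<?> cls;
        try {
            cls = Class.forName(className);
        } catch (ClassNotFoundException e) {
            return "Class not found";
        }
        if (!UDF.class.isAssignableFrom(cls)
                && !GenericUDF.class.isAssignableFrom(cls)
                && !UDAF.class.isAssignableFrom(cls)) {
            return "Class does not implement UDF, GenericUDF, or UDAF";
        }
        return null;
    }
}
```

FunctionTask could then print the returned message instead of only a bare return code.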
Re: Some wish after serious consideration
I have been watching HIVE-438 for a long time; you know that's a critical change impacting almost the whole hive source tree, and a quick resolution is needed. It's understandable that facebook's human resources are stretched thin, as developers there always join several projects at the same time. Therefore, we should use the power of the open source community to speed up its development. But right now, my feeling is that everyone only cares about their own affairs, regardless of what other people do. This is not the pattern of the open source community, yet we are still immersed in this pattern.

2009/7/9 He Yongqiang heyongqi...@software.ict.ac.cn:
> But I think no conflicts can be guaranteed, since conflicts are not raised by one patch. If no conflict appears to this patch, then there will be conflicts for other patches.

My meaning was not to refuse conflicts; there should be a way to reduce their frequency if we work together. I mean: "But I think no conflicts can NOT be guaranteed, since conflicts are not raised by one patch. If no conflict appears to this patch, then there will be conflicts for other patches."

On 09-7-9 2:43 PM, He Yongqiang heyongqi...@software.ict.ac.cn wrote: On 09-7-9 2:14 PM, Min Zhou coderp...@gmail.com wrote: Hi all, Having focused on hive for several months, here are some wishes of mine after serious consideration. 1. All auto-generated code for hive was produced under the facebook commercial version of thrift, which is older than the open source one; this leads to lots of compatibility problems and cuts us off from help from the open source community. We need to remove it as soon as possible, but it seems the progress on this issue has stopped. See HIVE-438. I think it will be committed by this weekend? 2. Please give us a clear roadmap. We also have a plan for improving hive, but our patches would probably go uncared-for, because they are not on facebook's schedule.
If things go on like this, there will be a lot of compatibility problems brought in by other commits, and we will be suffering from fixing conflicts again and again. I think the hive roadmap on the hive wiki page has just been updated. Please send out a request for code review if you think the patch is ready. But I think no conflicts can be guaranteed, since conflicts are not raised by one patch. If no conflict appears to this patch, then there will be conflicts for other patches. 3. Please don't commit code so rashly. Code from Ashish could easily be committed by others without strict examination, which caused a lot of problems when we used it here: bugs, and incondite code that is hard to read and to extend. Perhaps the main reason is that Ashish is the leader of Hive. Another person, Namit, has often committed buggy or ugly code. I have a suggestion: more discussion and tests with the help of the open source community. Code quality would be raised that way. (I don't intend any personal attacks here.) Code review is really hard and boring work, and we can only say that the code is very likely to be error-free. A patch is committed with at least two persons' work: the patch submitter and the code reviewer. Sometimes errors in the code are really hard to find, either by eye or by tests, so please be more patient; bugs can be fixed soon after they are observed. And I agree with your suggestion on more discussion, so please comment on the jira pages for issues you think need more discussion and tests. BTW, I think as the hive community grows, there could be more discussions. So the first-priority issue should be how to enlarge the hive community and get more people involved in the discussion on the hive mailing list or jira. Regards, Min -- My research interests are distributed systems, parallel computing and bytecode based virtual machine. My profile: http://www.linkedin.com/in/coderplay My blog: http://coderplay.javaeye.com
[jira] Work started: (HIVE-512) [GenericUDF] new string function ELT(N,str1,str2,str3,...)
[ https://issues.apache.org/jira/browse/HIVE-512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-512 started by Min Zhou. [GenericUDF] new string function ELT(N,str1,str2,str3,...) --- Key: HIVE-512 URL: https://issues.apache.org/jira/browse/HIVE-512 Project: Hadoop Hive Issue Type: New Feature Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-512.patch ELT(N,str1,str2,str3,...) Returns str1 if N = 1, str2 if N = 2, and so on. Returns NULL if N is less than 1 or greater than the number of arguments. ELT() is the complement of FIELD().
{noformat}
mysql> SELECT ELT(1, 'ej', 'Heja', 'hej', 'foo');
        -> 'ej'
mysql> SELECT ELT(4, 'ej', 'Heja', 'hej', 'foo');
        -> 'foo'
{noformat}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-555) create temporary function support not only udf, but also udaf, genericudf, etc.
[ https://issues.apache.org/jira/browse/HIVE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-555: -- Attachment: HIVE-555-2.patch with unit tests. create temporary function support not only udf, but also udaf, genericudf, etc. Key: HIVE-555 URL: https://issues.apache.org/jira/browse/HIVE-555 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-555-1.patch, HIVE-555-2.patch Right now, the command 'create temporary function' only supports udf. We can also let users write their udaf and generic udf, and write generic udaf in the future. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-555) create temporary function support not only udf, but also udaf, genericudf, etc.
[ https://issues.apache.org/jira/browse/HIVE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12728989#action_12728989 ] Min Zhou commented on HIVE-555: --- 1. I thought it would be a common function for generic udf error prompts. 2. Is that required for an existing generic udf? Regardless, I'll do it. create temporary function support not only udf, but also udaf, genericudf, etc. Key: HIVE-555 URL: https://issues.apache.org/jira/browse/HIVE-555 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-555-1.patch, HIVE-555-2.patch Right now, the command 'create temporary function' only supports udf. We can also let users write their udaf and generic udf, and write generic udaf in the future. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-555) create temporary function support not only udf, but also udaf, genericudf, etc.
[ https://issues.apache.org/jira/browse/HIVE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-555: -- Attachment: HIVE-555-3.patch patch following namit's comments. create temporary function support not only udf, but also udaf, genericudf, etc. Key: HIVE-555 URL: https://issues.apache.org/jira/browse/HIVE-555 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-555-1.patch, HIVE-555-2.patch, HIVE-555-3.patch Right now, the command 'create temporary function' only supports udf. We can also let users write their udaf and generic udf, and write generic udaf in the future. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-555) create temporary function support not only udf, but also udaf, genericudf, etc.
[ https://issues.apache.org/jira/browse/HIVE-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729009#action_12729009 ] Min Zhou commented on HIVE-555: --- @Zheng It would involve some logic outside of the FunctionTask. Actually, the execute methods of all Task classes are defined to return an integer standing for a status code. So creating another jira for that issue would be better. Agree? create temporary function support not only udf, but also udaf, genericudf, etc. Key: HIVE-555 URL: https://issues.apache.org/jira/browse/HIVE-555 Project: Hadoop Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-555-1.patch, HIVE-555-2.patch, HIVE-555-3.patch Right now, the command 'create temporary function' only supports udf. We can also let users write their udaf and generic udf, and write generic udaf in the future. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-329) start and stop hive thrift server in daemon mode
[ https://issues.apache.org/jira/browse/HIVE-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729012#action_12729012 ] Min Zhou commented on HIVE-329: --- start needs a port number, but stop doesn't. start and stop hive thrift server in daemon mode - Key: HIVE-329 URL: https://issues.apache.org/jira/browse/HIVE-329 Project: Hadoop Hive Issue Type: New Feature Components: Server Infrastructure Affects Versions: 0.3.0 Reporter: Min Zhou Attachments: daemon.patch I wrote two shell scripts to start and stop the hive thrift server more conveniently. usage:
bin/hive --service start-hive [HIVE_PORT]
bin/hive --service stop-hive
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-537) Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
[ https://issues.apache.org/jira/browse/HIVE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12727999#action_12727999 ] Min Zhou commented on HIVE-537: --- Zheng, how would you get a field value from an object without an ordinal? Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map) --- Key: HIVE-537 URL: https://issues.apache.org/jira/browse/HIVE-537 Project: Hadoop Hive Issue Type: New Feature Reporter: Zheng Shao Assignee: Min Zhou Attachments: HIVE-537.1.patch There are already some cases inside the code where we use heterogeneous data: JoinOperator, and UnionOperator (in the sense that different parents can pass in records with different ObjectInspectors). We currently use the Operator's parentID to distinguish them. However, that approach does not extend to more complex plans that might be needed in the future. We will support the union type like this:
{code}
TypeDefinition:
  type: primitivetype | structtype | arraytype | maptype | uniontype
  uniontype: union<tag : type (, tag : type)*>

Example:
  union<0:int,1:double,2:array<string>,3:struct<a:int,b:string>>

Example of serialized data format:
We will first store the tag byte before we serialize the object. On
deserialization, we will first read out the tag byte; then we know the
current type of the following object, so we can deserialize it
successfully.

Interface for ObjectInspector:
interface UnionObjectInspector {
  /** Returns the array of OIs that are for each of the tags */
  ObjectInspector[] getObjectInspectors();
  /** Return the tag of the object. */
  byte getTag(Object o);
  /** Return the field based on the tag value associated with the Object. */
  Object getField(Object o);
};

An example serialization format (using a delimited format, with ' ' as the
first-level delimiter and '=' as the second-level delimiter):
userid:int,log:union<0:struct<touserid:int,message:string>,1:string>
123 1=login
123 0=243=helloworld
123 1=logout
{code}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
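The "tag byte first, then the value" scheme described in the issue can be sketched with plain Java byte streams. This is a simplified model of the proposed format for a two-branch union<0:int,1:string>, not Hive's actual serde code; the class and method names are illustrative:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class TaggedUnion {
    /** Serialize union<0:int,1:string>: write the tag byte, then the value. */
    public static byte[] writeIntOrString(byte tag, Object value) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeByte(tag);                       // tag comes first
            if (tag == 0) out.writeInt((Integer) value);
            else out.writeUTF((String) value);
            return buf.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e); // cannot happen on in-memory streams
        }
    }

    /** Deserialize: read the tag first, so we know which type follows. */
    public static Object read(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            byte tag = in.readByte();
            return tag == 0 ? (Object) in.readInt() : in.readUTF();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Reading the tag before the value is exactly what lets the deserializer pick the right branch without any out-of-band schema information, which is the point of the proposed format.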
[jira] Updated: (HIVE-537) Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
[ https://issues.apache.org/jira/browse/HIVE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-537: -- Attachment: HIVE-537.1.patch Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map) --- Key: HIVE-537 URL: https://issues.apache.org/jira/browse/HIVE-537 Project: Hadoop Hive Issue Type: New Feature Reporter: Zheng Shao Assignee: Zheng Shao Attachments: HIVE-537.1.patch There are already some cases inside the code where we use heterogeneous data: JoinOperator, and UnionOperator (in the sense that different parents can pass in records with different ObjectInspectors). We currently use the Operator's parentID to distinguish them. However, that approach does not extend to more complex plans that might be needed in the future. We will support the union type like this:
{code}
TypeDefinition:
  type: primitivetype | structtype | arraytype | maptype | uniontype
  uniontype: union<tag : type (, tag : type)*>

Example:
  union<0:int,1:double,2:array<string>,3:struct<a:int,b:string>>

Example of serialized data format:
We will first store the tag byte before we serialize the object. On
deserialization, we will first read out the tag byte; then we know the
current type of the following object, so we can deserialize it
successfully.

Interface for ObjectInspector:
interface UnionObjectInspector {
  /** Returns the array of OIs that are for each of the tags */
  ObjectInspector[] getObjectInspectors();
  /** Return the tag of the object. */
  byte getTag(Object o);
  /** Return the field based on the tag value associated with the Object. */
  Object getField(Object o);
};

An example serialization format (using a delimited format, with ' ' as the
first-level delimiter and '=' as the second-level delimiter):
userid:int,log:union<0:struct<touserid:int,message:string>,1:string>
123 1=login
123 0=243=helloworld
123 1=logout
{code}
-- This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-537) Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
[ https://issues.apache.org/jira/browse/HIVE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12725532#action_12725532 ] Min Zhou commented on HIVE-537: --- Even if UnionObjectInspector has been implemented, DynamicSerDe does not seem to support a schema with a union type, which thrift can't recognize. We must find a way to solve this; any suggestions? Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map) --- Key: HIVE-537 URL: https://issues.apache.org/jira/browse/HIVE-537 Project: Hadoop Hive Issue Type: New Feature Reporter: Zheng Shao Assignee: Zheng Shao There are already some cases inside the code where we use heterogeneous data: JoinOperator, and UnionOperator (in the sense that different parents can pass in records with different ObjectInspectors). We currently use the Operator's parentID to distinguish them. However, that approach does not extend to more complex plans that might be needed in the future. We will support the union type like this:
{code}
TypeDefinition:
  type: primitivetype | structtype | arraytype | maptype | uniontype
  uniontype: union<tag : type (, tag : type)*>

Example:
  union<0:int,1:double,2:array<string>,3:struct<a:int,b:string>>

Example of serialized data format:
We will first store the tag byte before we serialize the object. On
deserialization, we will first read out the tag byte; then we know the
current type of the following object, so we can deserialize it
successfully.

Interface for ObjectInspector:
interface UnionObjectInspector {
  /** Returns the array of OIs that are for each of the tags */
  ObjectInspector[] getObjectInspectors();
  /** Return the tag of the object. */
  byte getTag(Object o);
  /** Return the field based on the tag value associated with the Object. */
  Object getField(Object o);
};

An example serialization format (using a delimited format, with ' ' as the
first-level delimiter and '=' as the second-level delimiter):
userid:int,log:union<0:struct<touserid:int,message:string>,1:string>
123 1=login
123 0=243=helloworld
123 1=logout
{code}
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
[ https://issues.apache.org/jira/browse/HIVE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12725564#action_12725564 ] Min Zhou commented on HIVE-577: --- Passed all test cases on hadoop 0.17.0 - 0.19.1. return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou Assignee: Min Zhou Attachments: HIVE-577.1.patch, HIVE-577.2.patch The comment of each column has not been retrieved correctly right now; FieldSchema.getComment() will return a string from the deserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
[ https://issues.apache.org/jira/browse/HIVE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-577: -- Attachment: HIVE-577.1.patch Can retrieve all columns' comments now. return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou Assignee: Min Zhou Attachments: HIVE-577.1.patch The comment of each column has not been retrieved correctly right now; FieldSchema.getComment() will return a string from the deserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
[ https://issues.apache.org/jira/browse/HIVE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-577: -- Attachment: HIVE-577.2.patch @Prasad I considered the case you mentioned before uploading that patch; I just didn't know the meaning of the code. This patch should cope with the issue. return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou Assignee: Min Zhou Attachments: HIVE-577.1.patch, HIVE-577.2.patch The comment of each column has not been retrieved correctly right now; FieldSchema.getComment() will return a string from the deserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
[ https://issues.apache.org/jira/browse/HIVE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12725450#action_12725450 ] Min Zhou commented on HIVE-577: --- I guess it's cumbersome to deal with custom tables through the api currently provided by hive. The DDL for the schema should be changed from struct{ type1 col1, type2 col2 } to some format like struct{ struct{type1 col1, string comment1}, struct{type2 col2, string comment2} }; however, MetaStoreUtils.getDDLFromFieldSchema(structName, fieldSchemas) is not only used for getSchema(table). return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou Assignee: Min Zhou Attachments: HIVE-577.1.patch, HIVE-577.2.patch The comment of each column has not been retrieved correctly right now; FieldSchema.getComment() will return a string from the deserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
[ https://issues.apache.org/jira/browse/HIVE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12725450#action_12725450 ] Min Zhou edited comment on HIVE-577 at 6/29/09 8:15 PM: I guess it's cumbersome to deal with custom tables through the api currently provided by hive. The DDL for the table schema should be changed from struct{ type1 col1, type2 col2 } to some format like struct{ struct{type1 col1, string comment1}, struct{type2 col2, string comment2} }; however, MetaStoreUtils.getDDLFromFieldSchema(structName, fieldSchemas) is not only used for getSchema(table). was (Author: coderplay): I guessed it's cumbersome to deal with custom tables from current api provided by hive currently. ddl for schema should changed from struct{ type1 col1, type2 col2} to some format like struct{ struct{type1 col1, string comment1}, struct{type2 col2, string comment2}} however, MetaStoreUtils.getDDLFromFieldSchema(structName, fieldSchemas) is not only for getSchema(table). return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou Assignee: Min Zhou Attachments: HIVE-577.1.patch, HIVE-577.2.patch The comment of each column has not been retrieved correctly right now; FieldSchema.getComment() will return a string from the deserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
[ https://issues.apache.org/jira/browse/HIVE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12725473#action_12725473 ] Min Zhou commented on HIVE-577: --- Any suggestions on this, or will you accept the 2nd patch, Prasad? return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou Assignee: Min Zhou Attachments: HIVE-577.1.patch, HIVE-577.2.patch The comment of each column has not been retrieved correctly right now; FieldSchema.getComment() will return a string from the deserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-537) Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map)
[ https://issues.apache.org/jira/browse/HIVE-537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724916#action_12724916 ] Min Zhou commented on HIVE-537: --- We've done a test on this issue. Dataset: 700m records. With the first approach, each distinct count needs 119 seconds, which means 10 distinct counts need at least 1190 seconds. With the second approach, where distinct keys are distinguished by a tag, 10 distinct counts need 148 seconds. Hive TypeInfo/ObjectInspector to support union (besides struct, array, and map) --- Key: HIVE-537 URL: https://issues.apache.org/jira/browse/HIVE-537 Project: Hadoop Hive Issue Type: New Feature Reporter: Zheng Shao Assignee: Zheng Shao There are already some cases inside the code where we use heterogeneous data: JoinOperator, and UnionOperator (in the sense that different parents can pass in records with different ObjectInspectors). We currently use the Operator's parentID to distinguish them. However, that approach does not extend to more complex plans that might be needed in the future. We will support the union type like this:
{code}
TypeDefinition:
  type: primitivetype | structtype | arraytype | maptype | uniontype
  uniontype: union<tag : type (, tag : type)*>

Example:
  union<0:int,1:double,2:array<string>,3:struct<a:int,b:string>>

Example of serialized data format:
We will first store the tag byte before we serialize the object. On
deserialization, we will first read out the tag byte; then we know the
current type of the following object, so we can deserialize it
successfully.

Interface for ObjectInspector:
interface UnionObjectInspector {
  /** Returns the array of OIs that are for each of the tags */
  ObjectInspector[] getObjectInspectors();
  /** Return the tag of the object. */
  byte getTag(Object o);
  /** Return the field based on the tag value associated with the Object. */
  Object getField(Object o);
};
{code}
-- This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
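The tag-prefixed serialized format proposed in HIVE-537 above can be illustrated with a few lines of plain Java. This is a hedged sketch, not Hive code: the class and method names (UnionTagSketch, writeUnion, readUnion) are invented for illustration, and only a union<0:int,1:double> is modeled.

```java
import java.io.*;

public class UnionTagSketch {
    // Serialize a union<0:int,1:double> value: the tag byte goes first,
    // then the payload, exactly as the HIVE-537 proposal describes.
    public static byte[] writeUnion(byte tag, Object value) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeByte(tag);                 // tag first
            if (tag == 0) {
                out.writeInt((Integer) value);  // tag 0 -> int payload
            } else {
                out.writeDouble((Double) value); // tag 1 -> double payload
            }
            out.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);  // cannot happen on byte arrays
        }
    }

    // Deserialize: read the tag first, then dispatch on it to pick the type.
    public static Object readUnion(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            byte tag = in.readByte();
            return (tag == 0) ? (Object) in.readInt() : (Object) in.readDouble();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The key property is that the reader never needs out-of-band type information: the one-byte tag selects the ObjectInspector for whatever follows.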
[jira] Commented: (HIVE-576) complete jdbc driver
[ https://issues.apache.org/jira/browse/HIVE-576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12724060#action_12724060 ] Min Zhou commented on HIVE-576: --- Done / to do:
# removed all useless comments auto-generated by Eclipse
# added APL (Apache license) statements to each file
# fixed a bug where SemanticAnalyzer.getSchema() fails after select-all queries on partitioned tables, i.e. queries like select * from tbl where partition_name=value
# implemented HiveResultSetMetadata and HiveDatabaseMetadata
# HiveResultSet now supports getXXX(columnName)
# removed JdbcSessionState, which was no longer used
# supported SQL Explorer for manipulating Hive data through a GUI
# to do: implement HivePreparedStatement and HiveCallableStatement
complete jdbc driver Key: HIVE-576 URL: https://issues.apache.org/jira/browse/HIVE-576 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Min Zhou Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-576.1.patch, HIVE-576.2.patch, sqlexplorer.jpg Hive only supports a few JDBC interfaces; let's complete it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-567: -- Attachment: (was: tables.jpg) jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz Instead of trying to get a complete implementation of jdbc, it's probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-567: -- Attachment: (was: sqlexplorer.jpg) jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz Instead of trying to get a complete implementation of jdbc, it's probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-576) complete jdbc driver
complete jdbc driver Key: HIVE-576 URL: https://issues.apache.org/jira/browse/HIVE-576 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Min Zhou Fix For: 0.4.0 Hive only supports a few JDBC interfaces; let's complete it. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12723526#action_12723526 ] Min Zhou commented on HIVE-567: --- It's not elegant to get the schema from the HiveServer by means of adding a function getFullDDLFromFieldSchema. jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz Instead of trying to get a complete implementation of jdbc, it's probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou The comment of each column is not retrieved correctly right now; FieldSchema.getComment() returns a string from the deserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-573) TestHiveServer broken
[ https://issues.apache.org/jira/browse/HIVE-573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12723841#action_12723841 ] Min Zhou commented on HIVE-573: --- Using JSON through Avro is a good approach here, but it makes things more complex: SerDe (although it is not an RPC), Thrift, and Avro amount to three duplications of the same work. TestHiveServer broken - Key: HIVE-573 URL: https://issues.apache.org/jira/browse/HIVE-573 Project: Hadoop Hive Issue Type: Bug Components: Server Infrastructure Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-573.1.patch This was after the change to HIVE-567 was committed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Work started: (HIVE-577) return correct comment of a column from ThriftHiveMetastore.Iface.get_fields
[ https://issues.apache.org/jira/browse/HIVE-577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HIVE-577 started by Min Zhou. return correct comment of a column from ThriftHiveMetastore.Iface.get_fields Key: HIVE-577 URL: https://issues.apache.org/jira/browse/HIVE-577 Project: Hadoop Hive Issue Type: Sub-task Reporter: Min Zhou Assignee: Min Zhou The comment of each column is not retrieved correctly right now; FieldSchema.getComment() returns a string from the deserializer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-574) Hive should use ClassLoader from hadoop Configuration
[ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12723874#action_12723874 ] Min Zhou commented on HIVE-574: --- +1 for Zheng, thanks. It worked fine here; nothing abnormal. Hive should use ClassLoader from hadoop Configuration - Key: HIVE-574 URL: https://issues.apache.org/jira/browse/HIVE-574 Project: Hadoop Hive Issue Type: Bug Affects Versions: 0.3.0, 0.3.1 Reporter: Zheng Shao Assignee: Zheng Shao Attachments: HIVE-574.1.patch, HIVE-574.2.patch, HIVE-574.3.patch See HIVE-338. Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:

{code}
package org.apache.hadoop.conf;

public class Configuration implements Iterable<Map.Entry<String,String>> {
  ...
  /**
   * Load a class by name.
   *
   * @param name the class name.
   * @return the class object.
   * @throws ClassNotFoundException if the class is not found.
   */
  public Class<?> getClassByName(String name) throws ClassNotFoundException {
    return Class.forName(name, true, classLoader);
  }
{code}

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
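A self-contained sketch of the pattern HIVE-574 recommends: resolve classes through a ClassLoader captured from the thread context, as Configuration.getClassByName does, rather than calling Class.forName(name) directly with the caller's defining loader. The class name ConfSketch and the resolves helper are invented for illustration; this is not Hadoop code.

```java
public class ConfSketch {
    private final ClassLoader classLoader;

    public ConfSketch() {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        // Fall back to this class's defining loader when no context loader is set.
        this.classLoader = (cl != null) ? cl : ConfSketch.class.getClassLoader();
    }

    // Mirrors Configuration.getClassByName: resolve via the captured loader.
    public Class<?> getClassByName(String name) throws ClassNotFoundException {
        return Class.forName(name, true, classLoader);
    }

    // Convenience predicate: does the captured loader resolve the name?
    public static boolean resolves(String name) {
        try {
            return new ConfSketch().getClassByName(name) != null;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

The point of routing every plug-in lookup (UDFs, SerDes, FileFormats) through one such method is that swapping the context loader, e.g. after an `add jar`, changes what all of them can see.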
[jira] Updated: (HIVE-559) Support JDBC ResultSetMetadata
[ https://issues.apache.org/jira/browse/HIVE-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-559: -- Issue Type: Sub-task (was: New Feature) Parent: HIVE-576 Support JDBC ResultSetMetadata -- Key: HIVE-559 URL: https://issues.apache.org/jira/browse/HIVE-559 Project: Hadoop Hive Issue Type: Sub-task Components: Clients Reporter: Bill Graham Assignee: Min Zhou Support ResultSetMetadata for JDBC ResultSets. The getColumn* methods would be particularly useful I'd expect: http://java.sun.com/javase/6/docs/api/java/sql/ResultSetMetaData.html The challenge as I see it though, is that the JDBC client only has access to the raw query string and the result data when running in standalone mode. Therefore, it will need to get the column metadata one of two ways: 1. By parsing the query to determine the tables/columns involved and then making a request to the metastore to get the metadata for the columns. This certainly feels like duplicate work, since the query of course gets properly parsed on the server. 2. By returning the column metadata from the server. My thrift knowledge is limited, but I suspect adding this to the response would present other challenges. Any thoughts or suggestions? Option #1 feels clunkier, yet safer. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-567: -- Attachment: sqlexplorer.jpg jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz, result.jpg, sqlexplorer.jpg, tables.jpg Instead of trying to get a complete implementation of jdbc, it's probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-338) Executing cli commands into thrift server
[ https://issues.apache.org/jira/browse/HIVE-338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12723400#action_12723400 ] Min Zhou commented on HIVE-338: --- Can you explain why you made a change to FunctionTask.java? It caused a java.lang.ClassNotFoundException when I executed my UDF. The ClassLoader did not work. Executing cli commands into thrift server - Key: HIVE-338 URL: https://issues.apache.org/jira/browse/HIVE-338 Project: Hadoop Hive Issue Type: Improvement Components: Server Infrastructure Affects Versions: 0.3.0 Reporter: Min Zhou Assignee: Zheng Shao Fix For: 0.4.0 Attachments: hive-338.final.patch, HIVE-338.postfix.1.patch, hiveserver-v1.patch, hiveserver-v2.patch, hiveserver-v3.patch Let thrift server support set, add/delete file/jar and normal HSQL query. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (HIVE-338) Executing cli commands into thrift server
[ https://issues.apache.org/jira/browse/HIVE-338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12723400#action_12723400 ] Min Zhou edited comment on HIVE-338 at 6/23/09 7:28 PM: Can you explain why you made a change to FunctionTask.java? It caused a java.lang.ClassNotFoundException when I executed my UDF, where MR jobs were submitted by the Hive CLI. The ClassLoader did not work. was (Author: coderplay): Can you explain why you made a change to FunctionTask.java? It caused a java.lang.ClassNotFoundException when I executed my UDF. The ClassLoader did not work. Executing cli commands into thrift server - Key: HIVE-338 URL: https://issues.apache.org/jira/browse/HIVE-338 Project: Hadoop Hive Issue Type: Improvement Components: Server Infrastructure Affects Versions: 0.3.0 Reporter: Min Zhou Assignee: Zheng Shao Fix For: 0.4.0 Attachments: hive-338.final.patch, HIVE-338.postfix.1.patch, hiveserver-v1.patch, hiveserver-v2.patch, hiveserver-v3.patch Let thrift server support set, add/delete file/jar and normal HSQL query. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-567: -- Attachment: tables.jpg jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz, tables.jpg Instead of trying to get a complete implementation of jdbc, it's probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou reassigned HIVE-567: - Assignee: Min Zhou (was: Raghotham Murthy) jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Min Zhou Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz, result.jpg, tables.jpg Instead of trying to get a complete implementation of jdbc, it's probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-567: -- Attachment: (was: result.jpg) jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Min Zhou Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz, tables.jpg Instead of trying to get a complete implementation of jdbc, it's probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-567: -- Attachment: result.jpg jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Min Zhou Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz, result.jpg, tables.jpg Instead of trying to get a complete implementation of jdbc, it's probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou reassigned HIVE-567: - Assignee: Raghotham Murthy (was: Min Zhou) incorrect manipulation jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz, result.jpg, tables.jpg Instead of trying to get a complete implementation of jdbc, it's probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-567) jdbc: integrate hive with pentaho report designer
[ https://issues.apache.org/jira/browse/HIVE-567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-567: -- Comment: was deleted (was: incorrect manipulation) jdbc: integrate hive with pentaho report designer - Key: HIVE-567 URL: https://issues.apache.org/jira/browse/HIVE-567 Project: Hadoop Hive Issue Type: Improvement Components: Clients Reporter: Raghotham Murthy Assignee: Raghotham Murthy Fix For: 0.4.0 Attachments: hive-567-server-output.txt, hive-567.1.patch, hive-567.2.patch, hive-567.3.patch, hive-pentaho.tgz, result.jpg, tables.jpg Instead of trying to get a complete implementation of jdbc, it's probably more useful to pick reporting/analytics software out there and implement the jdbc methods necessary to get them working. This jira is a first attempt at this. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-521) Move size, if, isnull, isnotnull to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12721169#action_12721169 ] Min Zhou commented on HIVE-521: --- I didn't expect all tests to pass, due to the missing class BinaryComparable. The reason for the failure has nothing to do with this JIRA. You can check out trunk and run ant -Dhadoop.version=0.17.0 test -Doverwrite=true and the error message will be displayed.
...
[junit] Exception: org/apache/hadoop/io/BinaryComparable
[junit] java.lang.NoClassDefFoundError: org/apache/hadoop/io/BinaryComparable
[junit] at java.lang.Class.getDeclaredConstructors0(Native Method)
[junit] at java.lang.Class.privateGetDeclaredConstructors(Class.java:2389)
[junit] at java.lang.Class.getConstructor0(Class.java:2699)
[junit] at java.lang.Class.newInstance0(Class.java:326)
[junit] at java.lang.Class.newInstance(Class.java:308)
[junit] at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getUDFMethod(FunctionRegistry.java:309)
[junit] at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getFuncExprNodeDesc(TypeCheckProcFactory.java:451)
[junit] at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:558)
[junit] at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:653)
[junit] at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:80)
[junit] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:83)
[junit] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:116)
[junit] at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:95)
[junit] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:3922)
[junit] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:1000)
[junit] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:986)
[junit] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:3163)
[junit] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:3610)
[junit] at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:3840)
[junit] at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:76)
[junit] at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:44)
[junit] at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:76)
[junit] at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:177)
[junit] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:209)
[junit] at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:176)
[junit] at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:216)
[junit] at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:471)
[junit] at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_case_sensitivity(TestCliDriver.java:726)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[junit] at java.lang.reflect.Method.invoke(Method.java:597)
[junit] at junit.framework.TestCase.runTest(TestCase.java:154)
[junit] at junit.framework.TestCase.runBare(TestCase.java:127)
[junit] at junit.framework.TestResult$1.protect(TestResult.java:106)
[junit] at junit.framework.TestResult.runProtected(TestResult.java:124)
[junit] at junit.framework.TestResult.run(TestResult.java:109)
[junit] at junit.framework.TestCase.run(TestCase.java:118)
[junit] at junit.framework.TestSuite.runTest(TestSuite.java:208)
[junit] at junit.framework.TestSuite.run(TestSuite.java:203)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:297)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:672)
[junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:567)
[junit] Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.BinaryComparable
[junit] at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
[junit
[jira] Commented: (HIVE-521) Move size, if, isnull, isnotnull to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12721231#action_12721231 ] Min Zhou commented on HIVE-521: --- @ HIVE-521-all-v7.patch
# {code:java}
boolean conditionTypeIsOk = (arguments[0].getCategory() == ObjectInspector.Category.PRIMITIVE);
if (conditionTypeIsOk) {
  PrimitiveObjectInspector poi = (PrimitiveObjectInspector) arguments[0];
  conditionTypeIsOk = (poi.getPrimitiveCategory() == PrimitiveObjectInspector.PrimitiveCategory.BOOLEAN
      || poi.getPrimitiveCategory() == PrimitiveObjectInspector.PrimitiveCategory.VOID);
}
if (!conditionTypeIsOk) {
  throw new UDFArgumentTypeException(0, "The first argument of function IF should be \""
      + Constants.BOOLEAN_TYPE_NAME + "\", but \"" + arguments[0].getTypeName() + "\" is found");
}
{code}
# {code:java}
String typeName = arguments[0].getTypeName();
if (!typeName.equals(Constants.BOOLEAN_TYPE_NAME)
    && !typeName.equals(Constants.VOID_TYPE_NAME)) {
  throw new UDFArgumentTypeException(0, "The first expression of function IF is expected to be \""
      + Constants.BOOLEAN_TYPE_NAME + "\", but \"" + arguments[0].getTypeName() + "\" is found");
}
{code}
I think the 2nd approach is more concise; do you agree? Move size, if, isnull, isnotnull to GenericUDF -- Key: HIVE-521 URL: https://issues.apache.org/jira/browse/HIVE-521 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-521-all-v1.patch, HIVE-521-all-v2.patch, HIVE-521-all-v3.patch, HIVE-521-all-v4.patch, HIVE-521-all-v5.patch, HIVE-521-all-v6.patch, HIVE-521-all-v7.patch, HIVE-521-IF-2.patch, HIVE-521-IF-3.patch, HIVE-521-IF-4.patch, HIVE-521-IF-5.patch, HIVE-521-IF.patch See HIVE-511 for an example of the move. size, if, isnull, isnotnull are all implemented with UDF but they are actually working on variable types of objects. We should move them to GenericUDF for better type handling.
This also helps to clean up the hack in doing type matching/type conversion in UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
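One caveat on the second, string-based variant discussed above: the two negated equals checks only express "neither boolean nor void" when joined with &&; a disjunction of the negations is true for every type name, including "boolean" itself. A minimal sketch of the intended guard predicate (the class name and constant values here are illustrative, not Hive's):

```java
public class IfTypeCheckSketch {
    // Stand-ins for Constants.BOOLEAN_TYPE_NAME / Constants.VOID_TYPE_NAME.
    static final String BOOLEAN_TYPE_NAME = "boolean";
    static final String VOID_TYPE_NAME = "void";

    // True when the first IF argument's type is invalid and should be rejected.
    // Joining the negations with || instead would reject every type name.
    public static boolean rejectConditionType(String typeName) {
        return !typeName.equals(BOOLEAN_TYPE_NAME)
            && !typeName.equals(VOID_TYPE_NAME);
    }
}
```

This is just De Morgan's law: not(boolean or void) is (not boolean) and (not void).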
[jira] Commented: (HIVE-521) Move size, if, isnull, isnotnull to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12721595#action_12721595 ] Min Zhou commented on HIVE-521: --- OK, we are splitting hairs. All tests passed here; let's commit it. +1 Move size, if, isnull, isnotnull to GenericUDF -- Key: HIVE-521 URL: https://issues.apache.org/jira/browse/HIVE-521 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-521-all-v1.patch, HIVE-521-all-v2.patch, HIVE-521-all-v3.patch, HIVE-521-all-v4.patch, HIVE-521-all-v5.patch, HIVE-521-all-v6.patch, HIVE-521-all-v7.patch, HIVE-521-IF-2.patch, HIVE-521-IF-3.patch, HIVE-521-IF-4.patch, HIVE-521-IF-5.patch, HIVE-521-IF.patch See HIVE-511 for an example of the move. size, if, isnull, isnotnull are all implemented with UDF but they are actually working on variable types of objects. We should move them to GenericUDF for better type handling. This also helps to clean up the hack in doing type matching/type conversion in UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-521) Move size, if, isnull, isnotnull to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-521: -- Attachment: HIVE-521-all-v5.patch Move size, if, isnull, isnotnull to GenericUDF -- Key: HIVE-521 URL: https://issues.apache.org/jira/browse/HIVE-521 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-521-all-v1.patch, HIVE-521-all-v2.patch, HIVE-521-all-v3.patch, HIVE-521-all-v4.patch, HIVE-521-all-v5.patch, HIVE-521-IF-2.patch, HIVE-521-IF-3.patch, HIVE-521-IF-4.patch, HIVE-521-IF-5.patch, HIVE-521-IF.patch See HIVE-511 for an example of the move. size, if, isnull, isnotnull are all implemented with UDF but they are actually working on variable types of objects. We should move them to GenericUDF for better type handling. This also helps to clean up the hack in doing type matching/type conversion in UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-521) Move size, if, isnull, isnotnull to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-521: -- Attachment: HIVE-521-all-v6.patch Move size, if, isnull, isnotnull to GenericUDF -- Key: HIVE-521 URL: https://issues.apache.org/jira/browse/HIVE-521 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-521-all-v1.patch, HIVE-521-all-v2.patch, HIVE-521-all-v3.patch, HIVE-521-all-v4.patch, HIVE-521-all-v5.patch, HIVE-521-all-v6.patch, HIVE-521-IF-2.patch, HIVE-521-IF-3.patch, HIVE-521-IF-4.patch, HIVE-521-IF-5.patch, HIVE-521-IF.patch See HIVE-511 for an example of the move. size, if, isnull, isnotnull are all implemented with UDF but they are actually working on variable types of objects. We should move them to GenericUDF for better type handling. This also helps to clean up the hack in doing type matching/type conversion in UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HIVE-564) sweep the non-open source elements from hive
sweep the non-open source elements from hive Key: HIVE-564 URL: https://issues.apache.org/jira/browse/HIVE-564 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Min Zhou Fix For: 0.4.0 There are some non-open-source things from Facebook in the current version of Hive. We should replace them with open-source versions of fb303.jar, libthrift.jar, etc., so that the open-source community is more able to amend the relevant code. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HIVE-521) Move size, if, isnull, isnotnull to GenericUDF
[ https://issues.apache.org/jira/browse/HIVE-521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Min Zhou updated HIVE-521: -- Attachment: HIVE-521-all-v4.patch passed tests on hadoop version 0.17.0. Move size, if, isnull, isnotnull to GenericUDF -- Key: HIVE-521 URL: https://issues.apache.org/jira/browse/HIVE-521 Project: Hadoop Hive Issue Type: Improvement Affects Versions: 0.4.0 Reporter: Zheng Shao Assignee: Min Zhou Fix For: 0.4.0 Attachments: HIVE-521-all-v1.patch, HIVE-521-all-v2.patch, HIVE-521-all-v3.patch, HIVE-521-all-v4.patch, HIVE-521-IF-2.patch, HIVE-521-IF-3.patch, HIVE-521-IF-4.patch, HIVE-521-IF-5.patch, HIVE-521-IF.patch See HIVE-511 for an example of the move. size, if, isnull, isnotnull are all implemented with UDF but they are actually working on variable types of objects. We should move them to GenericUDF for better type handling. This also helps to clean up the hack in doing type matching/type conversion in UDF. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-338) Executing cli commands into thrift server
[ https://issues.apache.org/jira/browse/HIVE-338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12720462#action_12720462 ] Min Zhou commented on HIVE-338: --- I think you should take a look at these lines of org.apache.hadoop.conf.Configuration:

{code:java}
private ClassLoader classLoader;
{
  classLoader = Thread.currentThread().getContextClassLoader();
  if (classLoader == null) {
    classLoader = Configuration.class.getClassLoader();
  }
}
...
public Class<?> getClassByName(String name) throws ClassNotFoundException {
  return Class.forName(name, true, classLoader);
}
{code}

The ClassLoader of the current thread changes when jars are added to the classpath, but conf does not pick up that change. Executing cli commands into thrift server - Key: HIVE-338 URL: https://issues.apache.org/jira/browse/HIVE-338 Project: Hadoop Hive Issue Type: Improvement Components: Server Infrastructure Affects Versions: 0.3.0 Reporter: Min Zhou Assignee: Min Zhou Attachments: hiveserver-v1.patch, hiveserver-v2.patch, hiveserver-v3.patch Let thrift server support set, add/delete file/jar and normal HSQL query. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
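The staleness Min describes above can be shown in a few self-contained lines: a loader captured at construction time does not track a later setContextClassLoader call, which is what happens when `add jar` installs a new URLClassLoader on the session thread. The class and method names here are invented for illustration; this is not Hive or Hadoop code.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class StaleLoaderSketch {
    // Captures the context ClassLoader once, the way Configuration does.
    private final ClassLoader captured = Thread.currentThread().getContextClassLoader();

    public boolean seesCurrentLoader() {
        return captured == Thread.currentThread().getContextClassLoader();
    }

    // Returns true when the captured loader has gone stale after a swap.
    public static boolean demoStaleness() {
        Thread t = Thread.currentThread();
        ClassLoader original = t.getContextClassLoader();
        StaleLoaderSketch conf = new StaleLoaderSketch(); // captures `original`
        try {
            // Simulate "add jar": install a fresh loader on the thread.
            t.setContextClassLoader(new URLClassLoader(new URL[0], original));
            return !conf.seesCurrentLoader(); // captured copy no longer matches
        } finally {
            t.setContextClassLoader(original); // restore the thread's loader
        }
    }
}
```

This is why re-reading the thread context loader at lookup time (or refreshing the cached one after `add jar`) matters for UDF resolution.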
[jira] Commented: (HIVE-556) let hive support theta join
[ https://issues.apache.org/jira/browse/HIVE-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719518#action_12719518 ]

Min Zhou commented on HIVE-556:
-------------------------------

I didn't see any filter there; hive will put all fields of my small table into the HTree.

{noformat}
hive> explain select /*+ MAPJOIN(a) */ a.url_pattern, w.url
      from application a join web_log w
      where w.logdate='20090611' and w.url rlike a.url_pattern and a.dt='20090609';

Common Join Operator
  condition map:
       Inner Join 0 to 1
  condition expressions:
    0 {bussiness_id} {subclass_id} {class_id} {note} {name} {url_pattern} {dt}
    1
{noformat}

We only put a.url_pattern into a HashMap in our raw map-reduce implementation.

let hive support theta join
---------------------------

                Key: HIVE-556
                URL: https://issues.apache.org/jira/browse/HIVE-556
            Project: Hadoop Hive
         Issue Type: New Feature
   Affects Versions: 0.4.0
           Reporter: Min Zhou
            Fix For: 0.4.0

Right now, hive only supports equi-joins. Sometimes that's not enough; we should consider implementing theta joins like:

{code:sql}
SELECT a.subid, a.id, t.url
FROM tbl t JOIN aux_tbl a ON t.url rlike a.url_pattern
WHERE t.dt='20090609' AND a.dt='20090609';
{code}

Any condition expression following 'ON' would be appropriate.
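[Editor's note] The "raw map-reduce implementation" mentioned above can be sketched in plain Java: only the small table's join key (url_pattern) is held in memory, and each streamed row is matched with an unanchored regex, mimicking rlike. Class name and table contents are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Sketch of the hand-written map-side theta join described above:
// keep only url_pattern from the small table in memory, stream the
// big table (web_log), and match each url with a regex. Hive's rlike
// is an unanchored match, hence Matcher.find() rather than matches().
public class MapSideThetaJoin {
    public static List<String[]> join(List<String> patterns, List<String> urls) {
        // Compile once; this is the whole in-memory footprint of the small side.
        List<Pattern> compiled = new ArrayList<>();
        for (String p : patterns) compiled.add(Pattern.compile(p));

        List<String[]> out = new ArrayList<>();
        for (String url : urls) {                            // streamed big side
            for (int i = 0; i < compiled.size(); i++) {
                if (compiled.get(i).matcher(url).find()) {   // rlike semantics
                    out.add(new String[]{patterns.get(i), url});
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> rows = join(
            Arrays.asList("item\\.example\\.com", "/auction/"),
            Arrays.asList("http://item.example.com/x", "http://other/y"));
        for (String[] r : rows) System.out.println(r[0] + " -> " + r[1]);
    }
}
```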
[jira] Assigned: (HIVE-559) Support JDBC ResultSetMetadata
[ https://issues.apache.org/jira/browse/HIVE-559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Min Zhou reassigned HIVE-559:
-----------------------------

    Assignee: Min Zhou

Support JDBC ResultSetMetadata
------------------------------

                Key: HIVE-559
                URL: https://issues.apache.org/jira/browse/HIVE-559
            Project: Hadoop Hive
         Issue Type: New Feature
         Components: Clients
           Reporter: Bill Graham
           Assignee: Min Zhou

Support ResultSetMetadata for JDBC ResultSets. The getColumn* methods would be particularly useful, I'd expect:
http://java.sun.com/javase/6/docs/api/java/sql/ResultSetMetaData.html

The challenge as I see it, though, is that the JDBC client only has access to the raw query string and the result data when running in standalone mode. Therefore, it will need to get the column metadata in one of two ways:
1. By parsing the query to determine the tables/columns involved and then making a request to the metastore to get the metadata for those columns. This certainly feels like duplicate work, since the query of course gets properly parsed on the server.
2. By returning the column metadata from the server. My thrift knowledge is limited, but I suspect adding this to the response would present other challenges.

Any thoughts or suggestions? Option #1 feels clunkier, yet safer.
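[Editor's note] A rough shape for option #2: if the server shipped column names/types back with the result, the client could answer the standard getColumn* calls from that data, e.g. via a dynamic proxy over java.sql.ResultSetMetaData. The column data and class names below are made up; this is not Hive's actual thrift interface:

```java
import java.lang.reflect.Proxy;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;

// Sketch of option #2: wrap server-supplied column metadata in a
// dynamic proxy implementing java.sql.ResultSetMetaData, so the
// client answers getColumn* calls without re-parsing the query.
public class MetaDemo {
    // Hypothetical payload the server could return alongside results.
    static final String[] NAMES = {"col_a", "col_b"};
    static final String[] TYPES = {"string", "int"};

    static ResultSetMetaData fromServer() {
        return (ResultSetMetaData) Proxy.newProxyInstance(
            MetaDemo.class.getClassLoader(),
            new Class<?>[]{ResultSetMetaData.class},
            (proxy, method, args) -> {
                switch (method.getName()) {          // JDBC columns are 1-based
                    case "getColumnCount":    return NAMES.length;
                    case "getColumnName":     return NAMES[(Integer) args[0] - 1];
                    case "getColumnTypeName": return TYPES[(Integer) args[0] - 1];
                    default: throw new UnsupportedOperationException(method.getName());
                }
            });
    }

    // Render "name type" pairs, hiding the checked SQLException for the demo.
    static String describe() {
        try {
            ResultSetMetaData md = fromServer();
            StringBuilder sb = new StringBuilder();
            for (int i = 1; i <= md.getColumnCount(); i++) {
                if (i > 1) sb.append(", ");
                sb.append(md.getColumnName(i)).append(' ').append(md.getColumnTypeName(i));
            }
            return sb.toString();
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe()); // col_a string, col_b int
    }
}
```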
[jira] Issue Comment Edited: (HIVE-474) Support for distinct selection on two or more columns
[ https://issues.apache.org/jira/browse/HIVE-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719368#action_12719368 ]

Min Zhou edited comment on HIVE-474 at 6/14/09 7:02 PM:
--------------------------------------------------------

I think there is another special case here. If the query has multiple distinct operations on the same column, we can push down the evaluation of those expressions into the reducers.

{code}
Query:
  select a,
         count(distinct if(condition, b, null))  as col1,
         count(distinct if(!condition, null, b)) as col2,
         count(distinct b)                       as col3

Plan:
  Job:
    Map side:
      Emit:
        distribution_key: a
        sort_key: a, b
        value: nothing
    Reduce side:
      Group by a; count col1, col2, col3 by evaluating their expressions
{code}

was (Author: coderplay):
I think there is another special case here. If the query has multiple distinct operations on the same column, we can push down the evaluation of those expressions into the reducers.

Query: select a, count(distinct if(condition, b, null)) as col1, count(distinct if(!condition, null, b)) as col2, count(distinct b) as col3
Plan: Job: Map side: Emit: distribution_key: a, sort_key: a, b, value: nothing. Reduce side: Group by a; count col1, col2, col3 by evaluating their expressions.

Support for distinct selection on two or more columns
-----------------------------------------------------

                Key: HIVE-474
                URL: https://issues.apache.org/jira/browse/HIVE-474
            Project: Hadoop Hive
         Issue Type: Improvement
         Components: Query Processor
           Reporter: Alexis Rondeau

The ability to select distinct over several individual columns, for example:

{code:sql}
select count(distinct user), count(distinct session) from actions;
{code}

currently returns the following failure:

{noformat}
FAILED: Error in semantic analysis: line 2:7 DISTINCT on Different Columns not Supported user
{noformat}
[jira] Commented: (HIVE-474) Support for distinct selection on two or more columns
[ https://issues.apache.org/jira/browse/HIVE-474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719368#action_12719368 ]

Min Zhou commented on HIVE-474:
-------------------------------

I think there is another special case here. If the query has multiple distinct operations on the same column, we can push down the evaluation of those expressions into the reducers.

{code}
Query:
  select a,
         count(distinct if(condition, b, null))  as col1,
         count(distinct if(!condition, null, b)) as col2,
         count(distinct b)                       as col3

Plan:
  Job:
    Map side:
      Emit:
        distribution_key: a
        sort_key: a, b
        value: nothing
    Reduce side:
      Group by a; count col1, col2, col3 by evaluating their expressions
{code}

Support for distinct selection on two or more columns
-----------------------------------------------------

                Key: HIVE-474
                URL: https://issues.apache.org/jira/browse/HIVE-474
            Project: Hadoop Hive
         Issue Type: Improvement
         Components: Query Processor
           Reporter: Alexis Rondeau

The ability to select distinct over several individual columns, for example:

{code:sql}
select count(distinct user), count(distinct session) from actions;
{code}

currently returns the following failure:

{noformat}
FAILED: Error in semantic analysis: line 2:7 DISTINCT on Different Columns not Supported user
{noformat}
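[Editor's note] The reduce side of the plan sketched above can be illustrated in plain Java: because rows for one group arrive sorted by (a, b), several count(distinct expr(b)) aggregates over the same column b can be computed in a single pass. The predicate (b starts with "x") is a made-up stand-in for the if() conditions in the query:

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the reduce-side evaluation: the reducer walks the sorted
// b values for one group, evaluates each distinct expression once per
// distinct b, and counts non-null results (count(distinct) ignores
// nulls). All names and the predicate are hypothetical.
public class MultiDistinctReducer {
    public static int[] reduce(List<String> sortedB) {
        int col1 = 0, col2 = 0, col3 = 0;
        String prev = null;
        for (String b : sortedB) {
            if (b.equals(prev)) continue;        // duplicates are adjacent after the sort
            prev = b;
            boolean cond = b.startsWith("x");    // hypothetical condition
            String v1 = cond ? b : null;         // if(condition, b, null)
            String v2 = !cond ? null : b;        // if(!condition, null, b)
            if (v1 != null) col1++;
            if (v2 != null) col2++;
            col3++;                              // count(distinct b)
        }
        return new int[]{col1, col2, col3};
    }

    public static void main(String[] args) {
        int[] c = reduce(Arrays.asList("a1", "a1", "b2", "x1", "x9", "x9"));
        System.out.println(c[0] + " " + c[1] + " " + c[2]); // 2 2 4
    }
}
```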
[jira] Issue Comment Edited: (HIVE-338) Executing cli commands into thrift server
[ https://issues.apache.org/jira/browse/HIVE-338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12717651#action_12717651 ]

Min Zhou edited comment on HIVE-338 at 6/10/09 11:37 PM:
---------------------------------------------------------

* exec/FunctionTask.java: is it necessary to specify the loader in the Class.forName call? I thought that the current thread context loader was always the first loader to be tried anyway during name resolution.

Yes, of course. The class loader held by HiveConf is older than that of the current thread.

This patch supports dfs, add/delete file/jar, and set now.

btw, Joydeep, would you do me a favor and write some test code? I am not familiar with that. You know, 'add jar' needs a separate jar, and I'm not quite sure how to organize them.

was (Author: coderplay):
* exec/FunctionTask.java: is it necessary to specify the loader in the Class.forName call? I thought that the current thread context loader was always the first loader to be tried anyway during name resolution.

Yes, of course. The class loader held by HiveConf is older than that of the current thread.

This patch supports dfs, add/delete file/jar, and set now.

btw, Joydeep, would you do me a favor and write some test code? I am not familiar with it. You know, 'add jar' needs a separate jar, and I'm not quite sure how to organize them.

Executing cli commands into thrift server
-----------------------------------------

                Key: HIVE-338
                URL: https://issues.apache.org/jira/browse/HIVE-338
            Project: Hadoop Hive
         Issue Type: Improvement
         Components: Server Infrastructure
   Affects Versions: 0.3.0
           Reporter: Min Zhou
        Attachments: hiveserver-v1.patch, hiveserver-v2.patch

Let the thrift server support set, add/delete file/jar, and normal HSQL queries.