[jira] [Commented] (HIVE-2748) Upgrade Hbase and ZK dependcies

2012-03-13 Thread Ashutosh Chauhan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229007#comment-13229007
 ] 

Ashutosh Chauhan commented on HIVE-2748:


TestHBaseSerDe fails with the latest patch. Perhaps the jackson libs need to be 
bumped to 1.7.1 as well.

> Upgrade Hbase and ZK dependcies
> ---
>
> Key: HIVE-2748
> URL: https://issues.apache.org/jira/browse/HIVE-2748
> Project: Hive
>  Issue Type: Task
>Affects Versions: 0.7.0, 0.7.1, 0.8.0, 0.8.1, 0.9.0
>Reporter: Ashutosh Chauhan
>Assignee: Enis Soztutar
> Attachments: HIVE-2748.3.patch, HIVE-2748.D1431.1.patch, 
> HIVE-2748.D1431.2.patch, HIVE-2748_v4.patch, HIVE-2748_v5.patch, 
> HIVE-2748_v6.patch, HIVE-2748_v7.patch
>
>
> Both projects have moved forward with significant improvements. Let's bump 
> the compile-time dependencies to keep up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Hive

2012-03-13 Thread indrani gorti
Hi,
I understand that when we start the CLI we are in the default database.
This is rooted at hive.metastore.warehouse.dir, which is typically
/user/hive/warehouse.

When we create a database, its default location is /user/hive/warehouse/
+ databasename + ".db".
Can't we have two databases in the same location? If yes, how does
Hive differentiate between the databases, and how is the appropriate
mapping to tables done?

Thanks in advance.

Indrani
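
The default layout described in this message can be sketched as follows. This is only an illustration of the path convention (warehouse root + database name + ".db"), not Hive's own code: the class and method names are hypothetical, and treating the "default" database as the warehouse root itself simply follows the description above.

```java
// Hypothetical helper, not part of Hive: mirrors the path convention from the message.
final class WarehousePathSketch {
    private WarehousePathSketch() {}

    static String defaultDatabasePath(String warehouseDir, String dbName) {
        // Drop a single trailing slash (except for the bare root "/").
        String root = (warehouseDir.length() > 1 && warehouseDir.endsWith("/"))
                ? warehouseDir.substring(0, warehouseDir.length() - 1)
                : warehouseDir;
        // Assumption from the message: the default database lives at the warehouse root.
        if ("default".equals(dbName)) {
            return root;
        }
        return root + "/" + dbName + ".db";
    }

    public static void main(String[] args) {
        System.out.println(defaultDatabasePath("/user/hive/warehouse", "sales"));
        System.out.println(defaultDatabasePath("/user/hive/warehouse/", "default"));
    }
}
```

Under this convention, two databases created without an explicit location end up in different directories simply because their names differ.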


[jira] [Commented] (HIVE-2609) NPE when pruning partitions by thrift method get_partitions_by_filter

2012-03-13 Thread Travis Crawford (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228911#comment-13228911
 ] 

Travis Crawford commented on HIVE-2609:
---

I ran into this today too and, in addition to updating the two jars Thomas 
mentioned, also had to update:

https://github.com/apache/hive/blob/trunk/metastore/src/model/package.jdo#L49

In our hive tables the column is named "COMMENT" - not "FCOMMENT". Without 
updating datanucleus things work fine, but this change is required when 
updating jars. I don't understand why the change in behavior yet though.

> NPE when pruning partitions by thrift method get_partitions_by_filter
> -
>
> Key: HIVE-2609
> URL: https://issues.apache.org/jira/browse/HIVE-2609
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Affects Versions: 0.7.1
>Reporter: Min Zhou
>
> It's indeed a DataNucleus bug.
> Try this code:
> {code}
> boolean open = false;
> for (int i = 0; i < 5 && !open; ++i) {
>   try {
>     transport.open();
>     open = true;
>   } catch (TTransportException e) {
>     System.out.println("failed to connect to MetaStore, re-trying...");
>     try {
>       Thread.sleep(1000);
>     } catch (InterruptedException ignore) {}
>   }
> }
> try {
>   List<Partition> parts =
>       client.get_partitions_by_filter("default", "partitioned_nation",
>           "pt < '2'", (short) -1);
>   for (Partition part : parts) {
>     System.out.println(part.getSd().getLocation());
>   }
> } catch (Exception te) {
>   te.printStackTrace();
> }
> {code}
> An NPE would be thrown on the Thrift server side:
> {noformat}
> 11/11/25 13:11:55 ERROR api.ThriftHiveMetastore$Processor: Internal error processing get_partitions_by_filter
> java.lang.NullPointerException
>     at org.datanucleus.store.mapped.mapping.MappingHelper.getMappingIndices(MappingHelper.java:35)
>     at org.datanucleus.store.mapped.expression.StatementText.applyParametersToStatement(StatementText.java:194)
>     at org.datanucleus.store.rdbms.query.RDBMSQueryUtils.getPreparedStatementForQuery(RDBMSQueryUtils.java:233)
>     at org.datanucleus.store.rdbms.query.legacy.SQLEvaluator.evaluate(SQLEvaluator.java:115)
>     at org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.performExecute(JDOQLQuery.java:288)
>     at org.datanucleus.store.query.Query.executeQuery(Query.java:1657)
>     at org.datanucleus.store.rdbms.query.legacy.JDOQLQuery.executeQuery(JDOQLQuery.java:245)
>     at org.datanucleus.store.query.Query.executeWithMap(Query.java:1526)
>     at org.datanucleus.jdo.JDOQuery.executeWithMap(JDOQuery.java:334)
>     at org.apache.hadoop.hive.metastore.ObjectStore.listMPartitionsByFilter(ObjectStore.java:1329)
>     at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:1241)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$40.run(HiveMetaStore.java:2369)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$40.run(HiveMetaStore.java:2366)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:307)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:2366)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_filter.process(ThriftHiveMetastore.java:6099)
>     at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor.process(ThriftHiveMetastore.java:4789)
>     at org.apache.hadoop.hive.metastore.HiveMetaStore$TLoggingProcessor.process(HiveMetaStore.java:3167)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> {noformat}
> A null JavaTypeMapping was passed into 
> org.datanucleus.store.mapped.mapping.MappingHelper.getMappingIndices(int initialPosition, 
> JavaTypeMapping mapping), which caused the NPE.
> After digging into the DataNucleus source, I found that the null value 
> originated in the constructor of 
> org.datanucleus.store.mapped.expression.SubstringExpression. See:
> {code}
> /**
>  * Constructs the substring
>  * @param str the String Expression
>  * @param begin The start position
>  * @param end The end position expression
>  **/   
> public SubstringExpression(StringExpression str, NumericExpression begin, 
> NumericExpression end)
> {
> super(str.getQueryExpression());
> st.app

[jira] [Updated] (HIVE-2748) Upgrade Hbase and ZK dependcies

2012-03-13 Thread Enis Soztutar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HIVE-2748:


Attachment: HIVE-2748_v7.patch

Uploading another patch for fixing the maven pattern to add the classifier. 

> Upgrade Hbase and ZK dependcies
> ---
>
> Key: HIVE-2748
> URL: https://issues.apache.org/jira/browse/HIVE-2748
> Project: Hive
>  Issue Type: Task
>Affects Versions: 0.7.0, 0.7.1, 0.8.0, 0.8.1, 0.9.0
>Reporter: Ashutosh Chauhan
>Assignee: Enis Soztutar
> Attachments: HIVE-2748.3.patch, HIVE-2748.D1431.1.patch, 
> HIVE-2748.D1431.2.patch, HIVE-2748_v4.patch, HIVE-2748_v5.patch, 
> HIVE-2748_v6.patch, HIVE-2748_v7.patch
>
>
> Both projects have moved forward with significant improvements. Let's bump 
> the compile-time dependencies to keep up.





[jira] [Commented] (HIVE-2748) Upgrade Hbase and ZK dependcies

2012-03-13 Thread Ashutosh Chauhan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228903#comment-13228903
 ] 

Ashutosh Chauhan commented on HIVE-2748:


Unable to compile tests. ivysettings.xml is probably missing the m2:classifier bit.

> Upgrade Hbase and ZK dependcies
> ---
>
> Key: HIVE-2748
> URL: https://issues.apache.org/jira/browse/HIVE-2748
> Project: Hive
>  Issue Type: Task
>Affects Versions: 0.7.0, 0.7.1, 0.8.0, 0.8.1, 0.9.0
>Reporter: Ashutosh Chauhan
>Assignee: Enis Soztutar
> Attachments: HIVE-2748.3.patch, HIVE-2748.D1431.1.patch, 
> HIVE-2748.D1431.2.patch, HIVE-2748_v4.patch, HIVE-2748_v5.patch, 
> HIVE-2748_v6.patch
>
>
> Both projects have moved forward with significant improvements. Let's bump 
> the compile-time dependencies to keep up.





Re: Potential bug around hive merging of small files

2012-03-13 Thread Shrijeet Paliwal
I have opened https://issues.apache.org/jira/browse/HIVE-2869

On Tue, Mar 13, 2012 at 8:37 AM, Ashutosh Chauhan wrote:

> This does look like a bug. Shrijeet, would you mind opening a JIRA and
> attaching your patch there?
>
> Thanks,
> Ashutosh
> On Mon, Mar 12, 2012 at 16:29, Shrijeet Paliwal wrote:
>
> > I had a typo in my last email. The settings are as follows:
> >
> > hive> set mapred.min.split.size.per.node=10;
> > hive> set mapred.min.split.size.per.rack=10;
> > hive> set mapred.max.split.size=10;
> > hive> set hive.merge.size.per.task=10;
> > hive> set hive.merge.smallfiles.avgsize=10;
> > hive> set hive.merge.size.smallfiles.avgsize=10;
> > *hive> set hive.merge.mapfiles=true;*
> > hive> set hive.merge.mapredfiles=true;
> >
> > *hive> set hive.mergejob.maponly=false;*
> >
> >
> >
> >
> > On Mon, Mar 12, 2012 at 4:27 PM, Shrijeet Paliwal
> > wrote:
> >
> > > Hive Version: Hive 0.8 (last commit SHA
> > >  b581a6192b8d4c544092679d05f45b2e50d42b45 )
> > >
> > > Hadoop version: cdh3u0
> > >
> > > I am trying to use the Hive small-file merge feature by setting all the
> > > necessary params.
> > > I am disabling use of CombineHiveInputFormat since my input is compressed
> > > text.
> > >
> > > hive> set mapred.min.split.size.per.node=10;
> > > hive> set mapred.min.split.size.per.rack=10;
> > > hive> set mapred.max.split.size=10;
> > > hive> set hive.merge.size.per.task=10;
> > > hive> set hive.merge.smallfiles.avgsize=10;
> > > hive> set hive.merge.size.smallfiles.avgsize=10;
> > > hive> set hive.merge.mapfiles=false;
> > > hive> set hive.merge.mapredfiles=true;
> > >
> > >
> > > The plan decides to launch two MR jobs, but after the first job succeeds I
> > > get a runtime error:
> > >
> > > "java.lang.RuntimeException: Plan invalid, Reason: Reducers == 0 but
> > > reduce operator specified"
> > >
> > > I think the problem can be fixed with this patch I came up with:
> > > https://gist.github.com/2025303
> > >
> > > Of course, my understanding, and hence this patch, may be totally wrong.
> > > Please provide feedback.
> > >
> >
>


[jira] [Updated] (HIVE-2869) Merging small files throws RuntimeException when hive.mergejob.maponly=false

2012-03-13 Thread Shrijeet Paliwal (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shrijeet Paliwal updated HIVE-2869:
---

Attachment: data_to_reproduce.tar.gz

> Merging small files throws RuntimeException when hive.mergejob.maponly=false
> 
>
> Key: HIVE-2869
> URL: https://issues.apache.org/jira/browse/HIVE-2869
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 0.8.0
> Environment: CentOS release 5.5 (Final)
>Reporter: Shrijeet Paliwal
> Attachments: data_to_reproduce.tar.gz
>
>
> Hive Version: Hive 0.8 (last commit SHA  
> b581a6192b8d4c544092679d05f45b2e50d42b45 ) 
> Hadoop version: cdh3u0
> Trying to use the Hive small-file merge feature by setting all the necessary 
> params.
> Have disabled use of CombineHiveInputFormat since my input is compressed 
> text. 
> {noformat}
> hive> set mapred.min.split.size.per.node=10;
> hive> set mapred.min.split.size.per.rack=10;
> hive> set mapred.max.split.size=10;
> hive> set hive.merge.size.per.task=10;
> hive> set hive.merge.smallfiles.avgsize=10;
> hive> set hive.merge.size.smallfiles.avgsize=10;
> hive> set hive.merge.mapfiles=true;
> hive> set hive.merge.mapredfiles=true;
> hive> set hive.mergejob.maponly=false;
> {noformat}
> The plan decides to launch two MR jobs, but after the first job succeeds I get 
> a runtime error: 
> "java.lang.RuntimeException: Plan invalid, Reason: Reducers == 0 but reduce 
> operator specified"
> *How to reproduce:* 
> * Create tables as follows: 
> {code}
> --create input table
> create table tmp_notmerged (
>   id    int,
>   name  string
> )
> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
> STORED AS TEXTFILE;
> --create o/p table
> create table tmp_merged (
>   id    int
> )
> STORED AS TEXTFILE;
> {code}
> * Load data into tmp_notmerged (the files are attached to this JIRA)
> * Set the knobs and run the Hive query: 
> {code}
> set hive.merge.mapfiles=true;
> set hive.mergejob.maponly=false;
> insert overwrite table tmp_merged select id from tmp_notmerged;
> {code}
> * You should see the error "java.lang.RuntimeException: Plan invalid, Reason: 
> Reducers == 0 but reduce operator specified"
> *Proposed fix :*
> Patch is here : https://gist.github.com/2025303





[jira] [Created] (HIVE-2869) Merging small files throws RuntimeException when hive.mergejob.maponly=false

2012-03-13 Thread Shrijeet Paliwal (Created) (JIRA)
Merging small files throws RuntimeException when hive.mergejob.maponly=false


 Key: HIVE-2869
 URL: https://issues.apache.org/jira/browse/HIVE-2869
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.8.0
 Environment: CentOS release 5.5 (Final)
Reporter: Shrijeet Paliwal
 Attachments: data_to_reproduce.tar.gz

Hive Version: Hive 0.8 (last commit SHA  
b581a6192b8d4c544092679d05f45b2e50d42b45 ) 
Hadoop version: cdh3u0

Trying to use the Hive small-file merge feature by setting all the necessary 
params.
Have disabled use of CombineHiveInputFormat since my input is compressed text. 

{noformat}
hive> set mapred.min.split.size.per.node=10;
hive> set mapred.min.split.size.per.rack=10;
hive> set mapred.max.split.size=10;
hive> set hive.merge.size.per.task=10;
hive> set hive.merge.smallfiles.avgsize=10;
hive> set hive.merge.size.smallfiles.avgsize=10;
hive> set hive.merge.mapfiles=true;
hive> set hive.merge.mapredfiles=true;
hive> set hive.mergejob.maponly=false;
{noformat}

The plan decides to launch two MR jobs, but after the first job succeeds I get 
a runtime error: 
"java.lang.RuntimeException: Plan invalid, Reason: Reducers == 0 but reduce 
operator specified"

*How to reproduce:* 

* Create tables as follows: 
{code}
--create input table
create table tmp_notmerged (
  id    int,
  name  string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;


--create o/p table
create table tmp_merged (
  id    int
)
STORED AS TEXTFILE;
{code}

* Load data into tmp_notmerged (the files are attached to this JIRA)

* Set the knobs and run the Hive query: 
{code}
set hive.merge.mapfiles=true;
set hive.mergejob.maponly=false;
insert overwrite table tmp_merged select id from tmp_notmerged;
{code}

* You should see the error "java.lang.RuntimeException: Plan invalid, Reason: 
Reducers == 0 but reduce operator specified"


*Proposed fix :*

Patch is here : https://gist.github.com/2025303
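
The error text in this report suggests a consistency check between the configured number of reducers and the presence of a reduce operator in the plan. Below is a minimal sketch of such a check. The names (PlanCheckSketch, invalidReason) are hypothetical, the code does not reproduce Hive's actual implementation or the linked patch, and the second message is an assumption added only for symmetry.

```java
// Hypothetical stand-in for a plan validity check; not Hive's real class.
final class PlanCheckSketch {
    private PlanCheckSketch() {}

    /** Returns null when the plan is consistent, otherwise a human-readable reason. */
    static String invalidReason(int numReduceTasks, boolean hasReduceOperator) {
        if (numReduceTasks == 0 && hasReduceOperator) {
            // Message text taken from the error quoted in this report.
            return "Reducers == 0 but reduce operator specified";
        }
        if (numReduceTasks > 0 && !hasReduceOperator) {
            // Assumed counterpart message, not quoted from the report.
            return "Reducers > 0 but no reduce operator specified";
        }
        return null;
    }

    public static void main(String[] args) {
        String reason = invalidReason(0, true);
        if (reason != null) {
            System.out.println("Plan invalid, Reason: " + reason);
        }
    }
}
```

If the merge job is planned with a reduce operator (hive.mergejob.maponly=false) while the reducer count stays at zero, a check of this shape would reject the plan, which matches the symptom described.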





[jira] [Updated] (HIVE-2822) Add JSON output to the hive ddl commands

2012-03-13 Thread Alan Gates (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alan Gates updated HIVE-2822:
-

Attachment: HIVE-2822.04-branch-08.patch

A version of the patch that applies against 0.8 branch and fixes compile 
issues.  It also includes changes to ivy to pick up jackson.

> Add JSON output to the hive ddl commands
> 
>
> Key: HIVE-2822
> URL: https://issues.apache.org/jira/browse/HIVE-2822
> Project: Hive
>  Issue Type: Improvement
>Reporter: Chris Dean
> Attachments: HIVE-2822.03-branch0-8.patch, HIVE-2822.03.patch, 
> HIVE-2822.03b.patch, HIVE-2822.04-branch-08.patch, 
> hive-json-01-branch0-8.patch, hive-json-01.patch, 
> hive-json-02-branch0-8.patch, hive-json-02.patch
>
>
> The goal is to have an option to produce JSON output of the DDL commands that 
> is easily machine parseable.
> For example, "desc my_table" currently gives
> {noformat}
> id    bigint
> user  string
> {noformat} 
> and we want to allow JSON output:
> {noformat}
> {
>   "columns": [
>     {"name": "id", "type": "bigint"},
>     {"name": "user", "type": "string"}
>   ]
> }
> {noformat} 
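
The JSON shape requested in this issue can be illustrated with a small, dependency-free sketch. It is not the attached patch (which, per the comment above, picks up jackson via ivy); the class and method names are hypothetical, and the hand-rolled escaping only covers backslashes and double quotes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical formatter showing the target JSON shape; not the code from the patch.
final class DescJsonSketch {
    private DescJsonSketch() {}

    /** Renders column name -> type pairs as {"columns": [{"name": ..., "type": ...}, ...]}. */
    static String toJson(Map<String, String> columns) {
        StringBuilder sb = new StringBuilder("{\"columns\": [");
        boolean first = true;
        for (Map.Entry<String, String> e : columns.entrySet()) {
            if (!first) {
                sb.append(", ");
            }
            sb.append("{\"name\": \"").append(escape(e.getKey()))
              .append("\", \"type\": \"").append(escape(e.getValue())).append("\"}");
            first = false;
        }
        return sb.append("]}").toString();
    }

    // Minimal escaping: backslash first, then double quote.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        Map<String, String> cols = new LinkedHashMap<>(); // keeps column order
        cols.put("id", "bigint");
        cols.put("user", "string");
        System.out.println(toJson(cols));
    }
}
```

A real implementation would more likely delegate to a JSON library so that all control characters and nested types are handled correctly.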





Hive-trunk-h0.21 - Build # 1308 - Fixed

2012-03-13 Thread Apache Jenkins Server
Changes for Build #1305

Changes for Build #1306

Changes for Build #1307
[kevinwilfong] HIVE-2714. Lots of special characters are not handled in LIKE. 
(jonchang via kevinwilfong)


Changes for Build #1308



All tests passed

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1308)

Status: Fixed

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1308/ to 
view the results.

[jira] [Updated] (HIVE-2822) Add JSON output to the hive ddl commands

2012-03-13 Thread Chris Dean (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Dean updated HIVE-2822:
-

Attachment: HIVE-2822.03-branch0-8.patch

Patch against 0.8 with unit tests


> Add JSON output to the hive ddl commands
> 
>
> Key: HIVE-2822
> URL: https://issues.apache.org/jira/browse/HIVE-2822
> Project: Hive
>  Issue Type: Improvement
>Reporter: Chris Dean
> Attachments: HIVE-2822.03-branch0-8.patch, HIVE-2822.03.patch, 
> HIVE-2822.03b.patch, hive-json-01-branch0-8.patch, hive-json-01.patch, 
> hive-json-02-branch0-8.patch, hive-json-02.patch
>
>
> The goal is to have an option to produce JSON output of the DDL commands that 
> is easily machine parseable.
> For example, "desc my_table" currently gives
> {noformat}
> id    bigint
> user  string
> {noformat} 
> and we want to allow JSON output:
> {noformat}
> {
>   "columns": [
>     {"name": "id", "type": "bigint"},
>     {"name": "user", "type": "string"}
>   ]
> }
> {noformat} 





[jira] [Updated] (HIVE-2748) Upgrade Hbase and ZK dependcies

2012-03-13 Thread Enis Soztutar (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HIVE-2748:


Attachment: HIVE-2748_v6.patch

Attaching rebased patch. 

> Upgrade Hbase and ZK dependcies
> ---
>
> Key: HIVE-2748
> URL: https://issues.apache.org/jira/browse/HIVE-2748
> Project: Hive
>  Issue Type: Task
>Affects Versions: 0.7.0, 0.7.1, 0.8.0, 0.8.1, 0.9.0
>Reporter: Ashutosh Chauhan
>Assignee: Enis Soztutar
> Attachments: HIVE-2748.3.patch, HIVE-2748.D1431.1.patch, 
> HIVE-2748.D1431.2.patch, HIVE-2748_v4.patch, HIVE-2748_v5.patch, 
> HIVE-2748_v6.patch
>
>
> Both projects have moved forward with significant improvements. Let's bump 
> the compile-time dependencies to keep up.





[jira] [Updated] (HIVE-2822) Add JSON output to the hive ddl commands

2012-03-13 Thread Chris Dean (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Dean updated HIVE-2822:
-

Attachment: HIVE-2822.03b.patch

Unit tests (with license transfer)

> Add JSON output to the hive ddl commands
> 
>
> Key: HIVE-2822
> URL: https://issues.apache.org/jira/browse/HIVE-2822
> Project: Hive
>  Issue Type: Improvement
>Reporter: Chris Dean
> Attachments: HIVE-2822.03.patch, HIVE-2822.03b.patch, 
> hive-json-01-branch0-8.patch, hive-json-01.patch, 
> hive-json-02-branch0-8.patch, hive-json-02.patch
>
>
> The goal is to have an option to produce JSON output of the DDL commands that 
> is easily machine parseable.
> For example, "desc my_table" currently gives
> {noformat}
> id    bigint
> user  string
> {noformat} 
> and we want to allow JSON output:
> {noformat}
> {
>   "columns": [
>     {"name": "id", "type": "bigint"},
>     {"name": "user", "type": "string"}
>   ]
> }
> {noformat} 





[jira] [Updated] (HIVE-2822) Add JSON output to the hive ddl commands

2012-03-13 Thread Chris Dean (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Dean updated HIVE-2822:
-

Attachment: HIVE-2822.03.patch

Patch with unit tests

> Add JSON output to the hive ddl commands
> 
>
> Key: HIVE-2822
> URL: https://issues.apache.org/jira/browse/HIVE-2822
> Project: Hive
>  Issue Type: Improvement
>Reporter: Chris Dean
> Attachments: HIVE-2822.03.patch, hive-json-01-branch0-8.patch, 
> hive-json-01.patch, hive-json-02-branch0-8.patch, hive-json-02.patch
>
>
> The goal is to have an option to produce JSON output of the DDL commands that 
> is easily machine parseable.
> For example, "desc my_table" currently gives
> {noformat}
> id    bigint
> user  string
> {noformat} 
> and we want to allow JSON output:
> {noformat}
> {
>   "columns": [
>     {"name": "id", "type": "bigint"},
>     {"name": "user", "type": "string"}
>   ]
> }
> {noformat} 





[jira] [Updated] (HIVE-2856) Fix TestCliDriver escape1.q failure on MR2

2012-03-13 Thread Zhenxiao Luo (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenxiao Luo updated HIVE-2856:
---

Status: Patch Available  (was: Open)

> Fix TestCliDriver escape1.q failure on MR2
> --
>
> Key: HIVE-2856
> URL: https://issues.apache.org/jira/browse/HIVE-2856
> Project: Hive
>  Issue Type: Bug
>Reporter: Zhenxiao Luo
>Assignee: Zhenxiao Luo
> Attachments: HIVE-2856.1.patch.txt, HIVE-2856.2.patch.txt, 
> escape1.q.out, escape1.q.out, escape2.q.out
>
>
> Additional '^' in escape test:
> [junit] Begin query: escape1.q
> [junit] Copying file: file:/home/cloudera/Code/hive/data/files/escapetest.txt
> [junit] 12/01/23 15:22:15 WARN conf.Configuration: mapred.system.dir is 
> deprecated. Instead, use mapreduce.jobtracker.system.dir
> [junit] 12/01/23 15:22:15 WARN conf.Configuration: mapred.local.dir is 
> deprecated. Instead, use mapreduce.cluster.local.dir
> [junit] diff -a -I file: -I pfile: -I hdfs: -I /tmp/ -I invalidscheme: -I 
> lastUpdateTime -I lastAccessTime -I [Oo]wner -I CreateTime -I LastAccessTime 
> -I Location -I LOCATION ' -I transient_lastDdlTime -I last_modified_ -I 
> java.lang.RuntimeException -I at org -I at sun -I at java -I at junit -I 
> Caused by: -I LOCK_QUERYID: -I LOCK_TIME: -I grantTime -I [.][.][.] [0-9]* 
> more -I job_[0-9]*_[0-9]* -I USING 'java -cp 
> /home/cloudera/Code/hive/build/ql/test/logs/clientpositive/escape1.q.out 
> /home/cloudera/Code/hive/ql/src/test/results/clientpositive/escape1.q.out
> [junit] 893d892
> [junit] < 1   1   ^
> [junit] junit.framework.AssertionFailedError: Client execution results failed 
> with error code = 1
> [junit] See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" 
> to get more logs.
> [junit] at junit.framework.Assert.fail(Assert.java:50)
> [junit] at 
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape1(TestCliDriver.java:131)
> [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> [junit] at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> [junit] at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> [junit] at java.lang.reflect.Method.invoke(Method.java:616)
> [junit] at junit.framework.TestCase.runTest(TestCase.java:168)
> [junit] at junit.framework.TestCase.runBare(TestCase.java:134)
> [junit] at junit.framework.TestResult$1.protect(TestResult.java:110)
> [junit] at junit.framework.TestResult.runProtected(TestResult.java:128)
> [junit] at junit.framework.TestResult.run(TestResult.java:113)
> [junit] at junit.framework.TestCase.run(TestCase.java:124)
> [junit] at junit.framework.TestSuite.runTest(TestSuite.java:243)
> [junit] at junit.framework.TestSuite.run(TestSuite.java:238)
> [junit] at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:420)
> [junit] at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:911)
> [junit] at 
> org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:768)
> [junit] Exception: Client execution results failed with error code = 1
> [junit] See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" 
> to get more logs.
> [junit] See build/ql/tmp/hive.log, or try "ant test ... -Dtest.silent=false" 
> to get more logs.)





Re: Potential bug around hive merging of small files

2012-03-13 Thread Ashutosh Chauhan
This does look like a bug. Shrijeet, would you mind opening a JIRA and
attaching your patch there?

Thanks,
Ashutosh
On Mon, Mar 12, 2012 at 16:29, Shrijeet Paliwal wrote:

> I had a typo in my last email. The settings are as follows:
>
> hive> set mapred.min.split.size.per.node=10;
> hive> set mapred.min.split.size.per.rack=10;
> hive> set mapred.max.split.size=10;
> hive> set hive.merge.size.per.task=10;
> hive> set hive.merge.smallfiles.avgsize=10;
> hive> set hive.merge.size.smallfiles.avgsize=10;
> *hive> set hive.merge.mapfiles=true;*
> hive> set hive.merge.mapredfiles=true;
>
> *hive> set hive.mergejob.maponly=false;*
>
>
>
>
> On Mon, Mar 12, 2012 at 4:27 PM, Shrijeet Paliwal
> wrote:
>
> > Hive Version: Hive 0.8 (last commit SHA
> >  b581a6192b8d4c544092679d05f45b2e50d42b45 )
> >
> > Hadoop version: cdh3u0
> >
> > I am trying to use the Hive small-file merge feature by setting all the
> > necessary params.
> > I am disabling use of CombineHiveInputFormat since my input is compressed
> > text.
> >
> > hive> set mapred.min.split.size.per.node=10;
> > hive> set mapred.min.split.size.per.rack=10;
> > hive> set mapred.max.split.size=10;
> > hive> set hive.merge.size.per.task=10;
> > hive> set hive.merge.smallfiles.avgsize=10;
> > hive> set hive.merge.size.smallfiles.avgsize=10;
> > hive> set hive.merge.mapfiles=false;
> > hive> set hive.merge.mapredfiles=true;
> >
> >
> > The plan decides to launch two MR jobs, but after the first job succeeds I
> > get a runtime error:
> >
> > "java.lang.RuntimeException: Plan invalid, Reason: Reducers == 0 but
> > reduce operator specified"
> >
> > I think the problem can be fixed with this patch I came up with:
> > https://gist.github.com/2025303
> >
> > Of course, my understanding, and hence this patch, may be totally wrong.
> > Please provide feedback.
> >
>


[jira] [Commented] (HIVE-2646) Hive Ivy dependencies on Hadoop should depend on jars directly, not tarballs

2012-03-13 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228416#comment-13228416
 ] 

Phabricator commented on HIVE-2646:
---

abayer has commented on the revision "HIVE-2646 [jira] Hive Ivy dependencies on 
Hadoop should depend on jars directly, not tarballs".

  Latest patch had four test failures, only one of which I've seen more than 
once so far. That one is 
org.apache.hadoop.hive.serde2.lazy.TestLazySimpleSerDe.testLazySimpleSerDe. The 
others are 
org.apache.hadoop.hive.metastore.TestMarkPartition.testMarkingPartitionSet, 
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample6, and 
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_udfnull.

REVISION DETAIL
  https://reviews.facebook.net/D2133


> Hive Ivy dependencies on Hadoop should depend on jars directly, not tarballs
> 
>
> Key: HIVE-2646
> URL: https://issues.apache.org/jira/browse/HIVE-2646
> Project: Hive
>  Issue Type: Bug
>  Components: Build Infrastructure
>Affects Versions: 0.8.0
>Reporter: Andrew Bayer
>Assignee: Andrew Bayer
>Priority: Critical
> Attachments: HIVE-2646.D2133.1.patch, HIVE-2646.D2133.10.patch, 
> HIVE-2646.D2133.11.patch, HIVE-2646.D2133.2.patch, HIVE-2646.D2133.3.patch, 
> HIVE-2646.D2133.4.patch, HIVE-2646.D2133.5.patch, HIVE-2646.D2133.6.patch, 
> HIVE-2646.D2133.7.patch, HIVE-2646.D2133.8.patch, HIVE-2646.D2133.9.patch, 
> HIVE-2646.diff.txt
>
>
> The current Hive Ivy dependency logic for its Hadoop dependencies is 
> problematic - depending on the tarball and extracting the jars from there, 
> rather than depending on the jars directly. It'd be great if this was fixed 
> to actually have the jar dependencies defined directly.





[jira] [Created] (HIVE-2868) Insert into table wipes out table content

2012-03-13 Thread Mauro Cazzari (Created) (JIRA)
Insert into table wipes out table content
-

 Key: HIVE-2868
 URL: https://issues.apache.org/jira/browse/HIVE-2868
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.1
Reporter: Mauro Cazzari
 Fix For: 0.8.1


The INSERT INTO statement still wipes out the target table, even though 
I have read in more than one place that the statement should behave in append mode. 
Is this true? If not, is there an available or upcoming fix for this?
Thanks! 





[jira] [Commented] (HIVE-1245) allow access to values stored as non-strings in HBase

2012-03-13 Thread zengchuan (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228266#comment-13228266
 ] 

zengchuan commented on HIVE-1245:
-

I'm new to HBase and Hive. I created a table in HBase and added an array of data to it.

public static void createTable(String tablename) throws IOException {
    HBaseAdmin admin = new HBaseAdmin(hbaseConfig);
    if (admin.tableExists(tablename)) {
        System.out.println("table Exists!!!");
    } else {
        HTableDescriptor tableDesc = new HTableDescriptor(tablename);
        tableDesc.addFamily(new HColumnDescriptor("dom"));
        admin.createTable(tableDesc);
    }
}

public static void addData(String tablename) throws IOException {
    HTable table = new HTable(hbaseConfig, tablename);
    Put put = new Put(Bytes.toBytes(String.valueOf(i)));

    List a = new ArrayList();
    a.add("domain1");
    a.add("domain2");

    Object obj = doType(hbaseConfig, a, List.class);
    Writable w = new HbaseObjectWritable(obj);
    byte[] depthMapByteArray = WritableUtils.toByteArray(w);
    put.add(Bytes.toBytes("dom"),
            Bytes.toBytes("domain"),
            depthMapByteArray);

    table.put(put);
}

private static Object doType(Configuration conf, Object value,
                             Class clazz)
        throws IOException {
    ByteArrayOutputStream byteStream = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(byteStream);
    HbaseObjectWritable.writeObject(out, value, clazz, conf);
    out.close();
    ByteArrayInputStream bais =
            new ByteArrayInputStream(byteStream.toByteArray());
    DataInputStream dis = new DataInputStream(bais);
    Object product = HbaseObjectWritable.readObject(dis, conf);
    dis.close();
    return product;
}

In Hive I created a table:

CREATE EXTERNAL TABLE hbase_table_2(row_key int, domain Array) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,dom:domain")
TBLPROPERTIES("hbase.table.name" = "table2");

In hbase_table_2, the domain Array column is not right. Why?

> allow access to values stored as non-strings in HBase
> -
>
> Key: HIVE-1245
> URL: https://issues.apache.org/jira/browse/HIVE-1245
> Project: Hive
>  Issue Type: Improvement
>  Components: HBase Handler
>Affects Versions: 0.6.0
>Reporter: John Sichi
>Assignee: Basab Maulik
>
> See  test case in
> http://mail-archives.apache.org/mod_mbox/hadoop-hive-user/201003.mbox/browser
