[jira] Commented: (HIVE-1526) Hive should depend on a release version of Thrift

2010-09-23 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12913933#action_12913933
 ] 

Carl Steinbach commented on HIVE-1526:
--

@Todd: Can you please regenerate this patch? Both 'patch -p0' and 'git apply 
-p0' fail. Thanks.

 Hive should depend on a release version of Thrift
 -

 Key: HIVE-1526
 URL: https://issues.apache.org/jira/browse/HIVE-1526
 Project: Hadoop Hive
  Issue Type: Task
  Components: Build Infrastructure, Clients
Reporter: Carl Steinbach
Assignee: Todd Lipcon
 Fix For: 0.6.0, 0.7.0

 Attachments: hive-1526.txt, libfb303.jar, libthrift.jar


 Hive should depend on a release version of Thrift, and ideally it should use 
 Ivy to resolve this dependency.
 The Thrift folks are working on adding Thrift artifacts to a maven repository 
 here: https://issues.apache.org/jira/browse/THRIFT-363

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1526) Hive should depend on a release version of Thrift

2010-09-23 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1526:
-

Fix Version/s: 0.6.0
   0.7.0
  Component/s: Clients

 Hive should depend on a release version of Thrift
 -

 Key: HIVE-1526
 URL: https://issues.apache.org/jira/browse/HIVE-1526
 Project: Hadoop Hive
  Issue Type: Task
  Components: Build Infrastructure, Clients
Reporter: Carl Steinbach
Assignee: Todd Lipcon
 Fix For: 0.6.0, 0.7.0

 Attachments: hive-1526.txt, libfb303.jar, libthrift.jar


 Hive should depend on a release version of Thrift, and ideally it should use 
 Ivy to resolve this dependency.
 The Thrift folks are working on adding Thrift artifacts to a maven repository 
 here: https://issues.apache.org/jira/browse/THRIFT-363

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1661) Default values for parameters

2010-09-23 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-1661:
---

Status: Resolved  (was: Patch Available)
Resolution: Fixed

Committed! Thanks, Siying!

 Default values for parameters
 -

 Key: HIVE-1661
 URL: https://issues.apache.org/jira/browse/HIVE-1661
 Project: Hadoop Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Siying Dong
 Fix For: 0.7.0

 Attachments: HIVE-1661.1.patch, HIVE-1661.2.patch


 It would be good to have a default value for some Hive parameters:
 say, RETENTION set to 30 days.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)

2010-09-23 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1364:
-

Attachment: HIVE-1364.3.patch.txt
HIVE-1364.3.backport-060.patch.txt

 Increase the maximum length of SERDEPROPERTIES values (currently 767 
 characters)
 

 Key: HIVE-1364
 URL: https://issues.apache.org/jira/browse/HIVE-1364
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.5.0
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.6.0, 0.7.0

 Attachments: HIVE-1364.2.patch.txt, 
 HIVE-1364.3.backport-060.patch.txt, HIVE-1364.3.patch.txt, HIVE-1364.patch


 The value component of a SERDEPROPERTIES key/value pair is currently limited
 to a maximum length of 767 characters. I believe that the motivation for 
 limiting the length to 
 767 characters is that this value is the maximum allowed length of an index in
 a MySQL database running on the InnoDB engine: 
 http://bugs.mysql.com/bug.php?id=13315
 * The Metastore OR mapping currently limits many fields (including 
 SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite 
 the fact that these fields are not indexed.
 * The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535.
 * We can expect many users to hit the 767 character limit on 
 SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping 
 serdeproperty to map a table that has many columns.
 I propose increasing the maximum allowed length of 
 SERDEPROPERTIES.PARAM_VALUE to 8192.
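
 To make the limit concrete, here is a hedged sketch of the kind of
 HBase-backed table that runs into it (the table, column, and column-family
 names below are made up); every mapped column adds another entry to
 hbase.columns.mapping, so wide tables quickly blow past 767 characters:

   -- The mapping string grows with every mapped column.
   CREATE TABLE hbase_wide_table (rowkey STRING, col1 STRING, col2 STRING, col3 STRING)
   STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
   WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:col1,cf:col2,cf:col3")
   TBLPROPERTIES ("hbase.table.name" = "wide_table");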

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Review Request: HIVE-1364: Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)

2010-09-23 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/895/
---

Review request for Hive Developers and John Sichi.


Summary
---

The patch increases the length of various properties in the Metastore OR 
mapping. Properties that are currently indexed, or that we may want to index in 
the future, were increased to a length of 767 bytes. Properties that are not 
indexed, and that we are unlikely to ever want to index, were increased to a 
maximum length of 4000 bytes. I also removed the PK constraint on the 
COLUMNS.TYPE_NAME field.


This addresses bug HIVE-1364.
http://issues.apache.org/jira/browse/HIVE-1364


Diffs
-

  metastore/src/model/package.jdo 527f4b2 

Diff: http://review.cloudera.org/r/895/diff


Testing
---


Thanks,

Carl



[jira] Updated: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)

2010-09-23 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1364:
-

Status: Patch Available  (was: Open)

 Increase the maximum length of SERDEPROPERTIES values (currently 767 
 characters)
 

 Key: HIVE-1364
 URL: https://issues.apache.org/jira/browse/HIVE-1364
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.5.0
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.6.0, 0.7.0

 Attachments: HIVE-1364.2.patch.txt, 
 HIVE-1364.3.backport-060.patch.txt, HIVE-1364.3.patch.txt, HIVE-1364.patch


 The value component of a SERDEPROPERTIES key/value pair is currently limited
 to a maximum length of 767 characters. I believe that the motivation for 
 limiting the length to 
 767 characters is that this value is the maximum allowed length of an index in
 a MySQL database running on the InnoDB engine: 
 http://bugs.mysql.com/bug.php?id=13315
 * The Metastore OR mapping currently limits many fields (including 
 SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite 
 the fact that these fields are not indexed.
 * The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535.
 * We can expect many users to hit the 767 character limit on 
 SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping 
 serdeproperty to map a table that has many columns.
 I propose increasing the maximum allowed length of 
 SERDEPROPERTIES.PARAM_VALUE to 8192.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)

2010-09-23 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914162#action_12914162
 ] 

HBase Review Board commented on HIVE-1364:
--

Message from: Carl Steinbach c...@cloudera.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/895/
---

Review request for Hive Developers and John Sichi.


Summary
---

The patch increases the length of various properties in the Metastore OR 
mapping. Properties that are currently indexed, or that we may want to index in 
the future, were increased to a length of 767 bytes. Properties that are not 
indexed, and that we are unlikely to ever want to index, were increased to a 
maximum length of 4000 bytes. I also removed the PK constraint on the 
COLUMNS.TYPE_NAME field.


This addresses bug HIVE-1364.
http://issues.apache.org/jira/browse/HIVE-1364


Diffs
-

  metastore/src/model/package.jdo 527f4b2 

Diff: http://review.cloudera.org/r/895/diff


Testing
---


Thanks,

Carl




 Increase the maximum length of SERDEPROPERTIES values (currently 767 
 characters)
 

 Key: HIVE-1364
 URL: https://issues.apache.org/jira/browse/HIVE-1364
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.5.0
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.6.0, 0.7.0

 Attachments: HIVE-1364.2.patch.txt, 
 HIVE-1364.3.backport-060.patch.txt, HIVE-1364.3.patch.txt, HIVE-1364.patch


 The value component of a SERDEPROPERTIES key/value pair is currently limited
 to a maximum length of 767 characters. I believe that the motivation for 
 limiting the length to 
 767 characters is that this value is the maximum allowed length of an index in
 a MySQL database running on the InnoDB engine: 
 http://bugs.mysql.com/bug.php?id=13315
 * The Metastore OR mapping currently limits many fields (including 
 SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite 
 the fact that these fields are not indexed.
 * The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535.
 * We can expect many users to hit the 767 character limit on 
 SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping 
 serdeproperty to map a table that has many columns.
 I propose increasing the maximum allowed length of 
 SERDEPROPERTIES.PARAM_VALUE to 8192.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1624) Patch to allows scripts in S3 location

2010-09-23 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914176#action_12914176
 ] 

He Yongqiang commented on HIVE-1624:


"Should I modify it to be hdfs://anything || s3://anything like path?"

Yes, that will be a great start. We can add more schemes if needed in the 
future.

Also, please make sure that if a program is neither on HDFS nor S3 and cannot 
be found locally, the query does not fail in the semantic analyzer. Otherwise, 
it may break a lot of existing queries.
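
For context, a hedged sketch of the intended usage once S3 script locations are 
allowed (the bucket and script names below are made up):

  ADD FILE s3://my-bucket/scripts/my_transform.py;
  SELECT TRANSFORM (key, value)
    USING 'my_transform.py' AS (key, value)
  FROM src;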

 Patch to allows scripts in S3 location
 --

 Key: HIVE-1624
 URL: https://issues.apache.org/jira/browse/HIVE-1624
 Project: Hadoop Hive
  Issue Type: New Feature
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-1624-2.patch, HIVE-1624.patch


 I want to submit a patch which allows users to run scripts located in S3.
 This patch enables Hive to download Hive scripts located in S3 buckets 
 and execute them. This saves users the effort of copying scripts to HDFS 
 before executing them.
 Thanks
 Vaibhav

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)

2010-09-23 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914196#action_12914196
 ] 

John Sichi commented on HIVE-1364:
--

Per discussion in IRC, we should not change the precision for identifiers.  
Here's the revert list (plus one change for MStorageDescriptor's TYPE_NAME).

MFieldSchema
FNAME

MType
TYPE_NAME
FIELD_NAME

MTable
TBL_NAME
PKEY_NAME
PARAM_KEY
TBL_TYPE

MSerDeInfo
NAME
PARAM_KEY

MOrder
COL_NAME

MStorageDescriptor
COLUMN_NAME (all instances)
PARAM_KEY
TYPE_NAME should actually be 4000 since it's really a type signature, not a 
type name, and we're getting rid of the indexing for it

MPartition
PARAM_KEY

MIndex
INDEX_NAME
PARAM_KEY


 Increase the maximum length of SERDEPROPERTIES values (currently 767 
 characters)
 

 Key: HIVE-1364
 URL: https://issues.apache.org/jira/browse/HIVE-1364
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.5.0
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.6.0, 0.7.0

 Attachments: HIVE-1364.2.patch.txt, 
 HIVE-1364.3.backport-060.patch.txt, HIVE-1364.3.patch.txt, HIVE-1364.patch


 The value component of a SERDEPROPERTIES key/value pair is currently limited
 to a maximum length of 767 characters. I believe that the motivation for 
 limiting the length to 
 767 characters is that this value is the maximum allowed length of an index in
 a MySQL database running on the InnoDB engine: 
 http://bugs.mysql.com/bug.php?id=13315
 * The Metastore OR mapping currently limits many fields (including 
 SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite 
 the fact that these fields are not indexed.
 * The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535.
 * We can expect many users to hit the 767 character limit on 
 SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping 
 serdeproperty to map a table that has many columns.
 I propose increasing the maximum allowed length of 
 SERDEPROPERTIES.PARAM_VALUE to 8192.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-1666) retry metadata operation in case of a failure

2010-09-23 Thread Namit Jain (JIRA)
retry metadata operation in case of a failure
--

 Key: HIVE-1666
 URL: https://issues.apache.org/jira/browse/HIVE-1666
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Paul Yang


If a user is trying to insert into a partition,

insert overwrite table T partition (p) select ..


it is possible that the directory gets created but the metadata creation for 
t...@p fails. Currently, we just throw an error, even though the final 
directory has already been created.

It would be useful to at least retry the metadata operation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HIVE-1667) Store the group of the owner of the table in metastore

2010-09-23 Thread Namit Jain (JIRA)
Store the group of the owner of the table in metastore
--

 Key: HIVE-1667
 URL: https://issues.apache.org/jira/browse/HIVE-1667
 Project: Hadoop Hive
  Issue Type: New Feature
Reporter: Namit Jain


Currently, the group of the table's owner is not stored in the metastore.

Secondly, when you create a table, the table's owner group is set to the group 
of the parent directory; it is not read from the UGI passed in.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1378) Return value for map, array, and struct needs to return a string

2010-09-23 Thread Steven Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Wong updated HIVE-1378:
--

Attachment: HIVE-1378.5.patch

New patch moves ArrayList construction and adds back schema mismatch check.

 Return value for map, array, and struct needs to return a string 
 -

 Key: HIVE-1378
 URL: https://issues.apache.org/jira/browse/HIVE-1378
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Drivers
Reporter: Jerome Boulon
Assignee: Steven Wong
 Fix For: 0.7.0

 Attachments: HIVE-1378.1.patch, HIVE-1378.2.patch, HIVE-1378.3.patch, 
 HIVE-1378.4.patch, HIVE-1378.5.patch, HIVE-1378.patch


 In order to be able to select/display any data from JDBC Hive driver, return 
 value for map, array, and struct needs to return a string
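
 To make this concrete, a small hedged example (the table and column names
 below are made up): the complex-typed columns need a string representation so
 that the JDBC driver can hand them back, e.g. through ResultSet.getString().

   CREATE TABLE complex_types (
     m MAP<STRING, INT>,
     a ARRAY<INT>,
     s STRUCT<x:INT, y:STRING>
   );
   -- With the change, selecting these columns over JDBC yields string renderings.
   SELECT m, a, s FROM complex_types;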

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1378) Return value for map, array, and struct needs to return a string

2010-09-23 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914236#action_12914236
 ] 

HBase Review Board commented on HIVE-1378:
--

Message from: Steven Wong skwsl...@gmail.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/828/
---

(Updated 2010-09-23 14:06:40.052031)


Review request for Hive Developers.


Changes
---

New patch moves ArrayList construction and adds back schema mismatch check.


Summary
---

HIVE-1606 is also fixed as a side effect.


This addresses bug HIVE-1378.
http://issues.apache.org/jira/browse/HIVE-1378


Diffs (updated)
-

  trunk/build.xml 999712 
  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 999712 
  trunk/conf/hive-default.xml 999712 
  trunk/data/files/datatypes.txt PRE-CREATION 
  trunk/data/scripts/input20_script 999712 
  trunk/eclipse-templates/.classpath 999712 
  trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveBaseResultSet.java 999712 
  trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveConnection.java 999712 
  trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveQueryResultSet.java 
999712 
  trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveResultSetMetaData.java 
999712 
  trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/JdbcColumn.java 999712 
  trunk/jdbc/src/test/org/apache/hadoop/hive/jdbc/TestJdbcDriver.java 999712 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java 999712 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
999712 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java 999712 
  trunk/ql/src/test/results/clientpositive/binary_output_format.q.out 999712 
  trunk/ql/src/test/results/compiler/plan/input20.q.xml 999712 
  trunk/ql/src/test/results/compiler/plan/input4.q.xml 999712 
  trunk/ql/src/test/results/compiler/plan/input5.q.xml 999712 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/DelimitedJSONSerDe.java 
PRE-CREATION 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.java 
999712 

Diff: http://review.cloudera.org/r/828/diff


Testing
---


Thanks,

Steven




 Return value for map, array, and struct needs to return a string 
 -

 Key: HIVE-1378
 URL: https://issues.apache.org/jira/browse/HIVE-1378
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Drivers
Reporter: Jerome Boulon
Assignee: Steven Wong
 Fix For: 0.7.0

 Attachments: HIVE-1378.1.patch, HIVE-1378.2.patch, HIVE-1378.3.patch, 
 HIVE-1378.4.patch, HIVE-1378.5.patch, HIVE-1378.patch


 In order to be able to select/display any data from JDBC Hive driver, return 
 value for map, array, and struct needs to return a string

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Review Request: HIVE-1364: Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)

2010-09-23 Thread Carl Steinbach

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/895/
---

(Updated 2010-09-23 14:45:46.673298)


Review request for Hive Developers and John Sichi.


Summary
---

The patch increases the length of various properties in the Metastore OR 
mapping. Properties that are currently indexed, or that we may want to index in 
the future, were increased to a length of 767 bytes. Properties that are not 
indexed, and that we are unlikely to ever want to index, were increased to a 
maximum length of 4000 bytes. I also removed the PK constraint on the 
COLUMNS.TYPE_NAME field.


This addresses bug HIVE-1364.
http://issues.apache.org/jira/browse/HIVE-1364


Diffs (updated)
-

  metastore/src/model/package.jdo 527f4b2 

Diff: http://review.cloudera.org/r/895/diff


Testing
---


Thanks,

Carl



[jira] Updated: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)

2010-09-23 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-1364:
-

Attachment: HIVE-1364.4.patch.txt
HIVE-1364.4.backport-060.patch.txt

Updated version of the patch with changes requested by John.

 Increase the maximum length of SERDEPROPERTIES values (currently 767 
 characters)
 

 Key: HIVE-1364
 URL: https://issues.apache.org/jira/browse/HIVE-1364
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.5.0
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.6.0, 0.7.0

 Attachments: HIVE-1364.2.patch.txt, 
 HIVE-1364.3.backport-060.patch.txt, HIVE-1364.3.patch.txt, 
 HIVE-1364.4.backport-060.patch.txt, HIVE-1364.4.patch.txt, HIVE-1364.patch


 The value component of a SERDEPROPERTIES key/value pair is currently limited
 to a maximum length of 767 characters. I believe that the motivation for 
 limiting the length to 
 767 characters is that this value is the maximum allowed length of an index in
 a MySQL database running on the InnoDB engine: 
 http://bugs.mysql.com/bug.php?id=13315
 * The Metastore OR mapping currently limits many fields (including 
 SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite 
 the fact that these fields are not indexed.
 * The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535.
 * We can expect many users to hit the 767 character limit on 
 SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping 
 serdeproperty to map a table that has many columns.
 I propose increasing the maximum allowed length of 
 SERDEPROPERTIES.PARAM_VALUE to 8192.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)

2010-09-23 Thread HBase Review Board (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914251#action_12914251
 ] 

HBase Review Board commented on HIVE-1364:
--

Message from: Carl Steinbach c...@cloudera.com

---
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/895/
---

(Updated 2010-09-23 14:45:46.673298)


Review request for Hive Developers and John Sichi.


Summary
---

The patch increases the length of various properties in the Metastore OR 
mapping. Properties that are currently indexed, or that we may want to index in 
the future, were increased to a length of 767 bytes. Properties that are not 
indexed, and that we are unlikely to ever want to index, were increased to a 
maximum length of 4000 bytes. I also removed the PK constraint on the 
COLUMNS.TYPE_NAME field.


This addresses bug HIVE-1364.
http://issues.apache.org/jira/browse/HIVE-1364


Diffs (updated)
-

  metastore/src/model/package.jdo 527f4b2 

Diff: http://review.cloudera.org/r/895/diff


Testing
---


Thanks,

Carl




 Increase the maximum length of SERDEPROPERTIES values (currently 767 
 characters)
 

 Key: HIVE-1364
 URL: https://issues.apache.org/jira/browse/HIVE-1364
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.5.0
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.6.0, 0.7.0

 Attachments: HIVE-1364.2.patch.txt, 
 HIVE-1364.3.backport-060.patch.txt, HIVE-1364.3.patch.txt, 
 HIVE-1364.4.backport-060.patch.txt, HIVE-1364.4.patch.txt, HIVE-1364.patch


 The value component of a SERDEPROPERTIES key/value pair is currently limited
 to a maximum length of 767 characters. I believe that the motivation for 
 limiting the length to 
 767 characters is that this value is the maximum allowed length of an index in
 a MySQL database running on the InnoDB engine: 
 http://bugs.mysql.com/bug.php?id=13315
 * The Metastore OR mapping currently limits many fields (including 
 SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite 
 the fact that these fields are not indexed.
 * The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535.
 * We can expect many users to hit the 767 character limit on 
 SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping 
 serdeproperty to map a table that has many columns.
 I propose increasing the maximum allowed length of 
 SERDEPROPERTIES.PARAM_VALUE to 8192.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1526) Hive should depend on a release version of Thrift

2010-09-23 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914255#action_12914255
 ] 

Carl Steinbach commented on HIVE-1526:
--

Sorry, that was a false alarm about the patch. It turns out the GitHub Hive 
mirror lags the main repo by about a week.

@Todd: This patch introduces unsatisfied dependencies on slf4j-api and 
slf4j-log4j12. Can you please update the patch to pull these dependencies down 
with Ivy?

 Hive should depend on a release version of Thrift
 -

 Key: HIVE-1526
 URL: https://issues.apache.org/jira/browse/HIVE-1526
 Project: Hadoop Hive
  Issue Type: Task
  Components: Build Infrastructure, Clients
Reporter: Carl Steinbach
Assignee: Todd Lipcon
 Fix For: 0.6.0, 0.7.0

 Attachments: hive-1526.txt, libfb303.jar, libthrift.jar


 Hive should depend on a release version of Thrift, and ideally it should use 
 Ivy to resolve this dependency.
 The Thrift folks are working on adding Thrift artifacts to a maven repository 
 here: https://issues.apache.org/jira/browse/THRIFT-363

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Hive Contributors Meeting Monday October 25th @ Facebook

2010-09-23 Thread Carl Steinbach
Announcing a new Meetup for Hive Contributors Group!

*What*: Next Hive Contributors Meeting: Monday October 25th @ Facebook
http://www.meetup.com/Hive-Contributors-Group/calendar/14875663/

*When*: Monday, October 25, 2010 4:30 PM

*Where*: Facebook HQ
1601 South California Avenue
Palo Alto, CA 94304

The next Hive Contributors Meeting is scheduled for October 25th from
4:30-6pm at Facebook's offices in Palo Alto. You must RSVP if you plan to
attend this event.

RSVP to this Meetup:
http://www.meetup.com/Hive-Contributors-Group/calendar/14875663/


[jira] Commented: (HIVE-1378) Return value for map, array, and struct needs to return a string

2010-09-23 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914282#action_12914282
 ] 

Ning Zhang commented on HIVE-1378:
--

Changes look good. However, there are conflicts when applying the patch to the 
latest trunk. Can you generate a new one against the latest trunk? I'll start 
testing once I get the new patch.

 Return value for map, array, and struct needs to return a string 
 -

 Key: HIVE-1378
 URL: https://issues.apache.org/jira/browse/HIVE-1378
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Drivers
Reporter: Jerome Boulon
Assignee: Steven Wong
 Fix For: 0.7.0

 Attachments: HIVE-1378.1.patch, HIVE-1378.2.patch, HIVE-1378.3.patch, 
 HIVE-1378.4.patch, HIVE-1378.5.patch, HIVE-1378.patch


 In order to be able to select/display any data from JDBC Hive driver, return 
 value for map, array, and struct needs to return a string

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1378) Return value for map, array, and struct needs to return a string

2010-09-23 Thread Steven Wong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Wong updated HIVE-1378:
--

Attachment: HIVE-1378.6.patch

Regenerated patch based on r1000539.

Interestingly, I haven't had any conflicts from svn up. How come you had 
conflicts?


 Return value for map, array, and struct needs to return a string 
 -

 Key: HIVE-1378
 URL: https://issues.apache.org/jira/browse/HIVE-1378
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Drivers
Reporter: Jerome Boulon
Assignee: Steven Wong
 Fix For: 0.7.0

 Attachments: HIVE-1378.1.patch, HIVE-1378.2.patch, HIVE-1378.3.patch, 
 HIVE-1378.4.patch, HIVE-1378.5.patch, HIVE-1378.6.patch, HIVE-1378.patch


 In order to be able to select/display any data from JDBC Hive driver, return 
 value for map, array, and struct needs to return a string

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1378) Return value for map, array, and struct needs to return a string

2010-09-23 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914305#action_12914305
 ] 

Ning Zhang commented on HIVE-1378:
--

OK. This one applied cleanly. I'm starting testing.

I think 'svn up' may be able to do more merging than 'patch'. I got a conflict 
on eclipse-templates/.classpath (it asked me whether I wanted to reverse-apply 
the patch) and on another file.

 Return value for map, array, and struct needs to return a string 
 -

 Key: HIVE-1378
 URL: https://issues.apache.org/jira/browse/HIVE-1378
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Drivers
Reporter: Jerome Boulon
Assignee: Steven Wong
 Fix For: 0.7.0

 Attachments: HIVE-1378.1.patch, HIVE-1378.2.patch, HIVE-1378.3.patch, 
 HIVE-1378.4.patch, HIVE-1378.5.patch, HIVE-1378.6.patch, HIVE-1378.patch


 In order to be able to select/display any data from JDBC Hive driver, return 
 value for map, array, and struct needs to return a string

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1264) Make Hive work with Hadoop security

2010-09-23 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914306#action_12914306
 ] 

John Sichi commented on HIVE-1264:
--

+1.  Will commit when tests pass.


 Make Hive work with Hadoop security
 ---

 Key: HIVE-1264
 URL: https://issues.apache.org/jira/browse/HIVE-1264
 Project: Hadoop Hive
  Issue Type: Improvement
Affects Versions: 0.7.0
Reporter: Jeff Hammerbacher
Assignee: Todd Lipcon
 Attachments: hive-1264-fb-mirror.txt, hive-1264.txt, 
 HiveHadoop20S_patch.patch




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1364) Increase the maximum length of SERDEPROPERTIES values (currently 767 characters)

2010-09-23 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914311#action_12914311
 ] 

John Sichi commented on HIVE-1364:
--

+1 from me pending metastore testing by Paul.


 Increase the maximum length of SERDEPROPERTIES values (currently 767 
 characters)
 

 Key: HIVE-1364
 URL: https://issues.apache.org/jira/browse/HIVE-1364
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.5.0
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.6.0, 0.7.0

 Attachments: HIVE-1364.2.patch.txt, 
 HIVE-1364.3.backport-060.patch.txt, HIVE-1364.3.patch.txt, 
 HIVE-1364.4.backport-060.patch.txt, HIVE-1364.4.patch.txt, HIVE-1364.patch


 The value component of a SERDEPROPERTIES key/value pair is currently limited
 to a maximum length of 767 characters. I believe that the motivation for 
 limiting the length to 
 767 characters is that this value is the maximum allowed length of an index in
 a MySQL database running on the InnoDB engine: 
 http://bugs.mysql.com/bug.php?id=13315
 * The Metastore OR mapping currently limits many fields (including 
 SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite 
 the fact that these fields are not indexed.
 * The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535.
 * We can expect many users to hit the 767 character limit on 
 SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping 
 serdeproperty to map a table that has many columns.
 I propose increasing the maximum allowed length of 
 SERDEPROPERTIES.PARAM_VALUE to 8192.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1364) Increase the maximum length of various metastore fields, and remove TYPE_NAME from COLUMNS primary key

2010-09-23 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1364:
-

Summary: Increase the maximum length of various metastore fields, and 
remove TYPE_NAME from COLUMNS primary key  (was: Increase the maximum length of 
SERDEPROPERTIES values (currently 767 characters))

 Increase the maximum length of various metastore fields, and remove TYPE_NAME 
 from COLUMNS primary key
 --

 Key: HIVE-1364
 URL: https://issues.apache.org/jira/browse/HIVE-1364
 Project: Hadoop Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.5.0
Reporter: Carl Steinbach
Assignee: Carl Steinbach
 Fix For: 0.6.0, 0.7.0

 Attachments: HIVE-1364.2.patch.txt, 
 HIVE-1364.3.backport-060.patch.txt, HIVE-1364.3.patch.txt, 
 HIVE-1364.4.backport-060.patch.txt, HIVE-1364.4.patch.txt, HIVE-1364.patch


 The value component of a SERDEPROPERTIES key/value pair is currently limited
 to a maximum length of 767 characters. I believe that the motivation for 
 limiting the length to 
 767 characters is that this value is the maximum allowed length of an index in
 a MySQL database running on the InnoDB engine: 
 http://bugs.mysql.com/bug.php?id=13315
 * The Metastore OR mapping currently limits many fields (including 
 SERDEPROPERTIES.PARAM_VALUE) to a maximum length of 767 characters despite 
 the fact that these fields are not indexed.
 * The maximum length of a VARCHAR value in MySQL 5.0.3 and later is 65,535.
 * We can expect many users to hit the 767 character limit on 
 SERDEPROPERTIES.PARAM_VALUE when using the hbase.columns.mapping 
 serdeproperty to map a table that has many columns.
 I propose increasing the maximum allowed length of 
 SERDEPROPERTIES.PARAM_VALUE to 8192.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1496) enhance CREATE INDEX to support immediate index build

2010-09-23 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914312#action_12914312
 ] 

John Sichi commented on HIVE-1496:
--

A related topic here is what happens when an INSERT is done to a table with an 
index on it.  For WITH DEFERRED REBUILD, it's necessary for the user to 
explicitly run the REBUILD.  For immediate rebuild, it should happen 
automatically for the partitions affected by the INSERT.

We can either do this now or treat it as a followup.
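
For reference, a hedged sketch of the two modes under discussion (the index, 
table, and column names are made up, and the 'COMPACT' handler shorthand is 
assumed):

  -- Today: deferred rebuild; the user must trigger the index build explicitly,
  -- and must re-run it after INSERTs.
  CREATE INDEX idx_key ON TABLE src (key)
  AS 'COMPACT' WITH DEFERRED REBUILD;
  ALTER INDEX idx_key ON src REBUILD;

  -- With immediate build, omitting WITH DEFERRED REBUILD would build the index
  -- right away, and INSERTs would refresh the affected partitions automatically.
  CREATE INDEX idx_key2 ON TABLE src (key) AS 'COMPACT';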


 enhance CREATE INDEX to support immediate index build
 -

 Key: HIVE-1496
 URL: https://issues.apache.org/jira/browse/HIVE-1496
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.7.0
Reporter: John Sichi
Assignee: Russell Melick
 Fix For: 0.7.0


 Currently we only support WITH DEFERRED REBUILD.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1624) Patch to allows scripts in S3 location

2010-09-23 Thread Vaibhav Aggarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914314#action_12914314
 ] 

Vaibhav Aggarwal commented on HIVE-1624:


Hi

I made the changes you had suggested.
Please review the patch.

Thanks
Vaibhav

 Patch to allows scripts in S3 location
 --

 Key: HIVE-1624
 URL: https://issues.apache.org/jira/browse/HIVE-1624
 Project: Hadoop Hive
  Issue Type: New Feature
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-1624-2.patch, HIVE-1624-3.patch, HIVE-1624.patch


 I want to submit a patch which allows users to run scripts located in S3.
 This patch enables Hive to download Hive scripts located in S3 buckets 
 and execute them. This saves users the effort of copying scripts to HDFS 
 before executing them.
 Thanks
 Vaibhav

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1624) Patch to allows scripts in S3 location

2010-09-23 Thread Vaibhav Aggarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Aggarwal updated HIVE-1624:
---

Attachment: HIVE-1624-3.patch

 Patch to allows scripts in S3 location
 --

 Key: HIVE-1624
 URL: https://issues.apache.org/jira/browse/HIVE-1624
 Project: Hadoop Hive
  Issue Type: New Feature
Reporter: Vaibhav Aggarwal
Assignee: Vaibhav Aggarwal
 Attachments: HIVE-1624-2.patch, HIVE-1624-3.patch, HIVE-1624.patch


 I want to submit a patch which allows users to run scripts located in S3.
 This patch enables Hive to download Hive scripts located in S3 buckets 
 and execute them. This saves users the effort of copying scripts to HDFS 
 before executing them.
 Thanks
 Vaibhav

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Build failed in Hudson: Hive-trunk-h0.20 #371

2010-09-23 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/371/changes

Changes:

[heyongqiang] HIVE-1661. Default values for parameters (Siying Dong via He 
Yongqiang)

[jvs] HIVE-1664. Eclipse build broken. (Steven Wong via jvs)

--
[...truncated 14006 lines...]
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket22.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket23.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv3.txt
[junit] Loading data to table src1
[junit] POSTHOOK: Output: defa...@src1
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.seq
[junit] Loading data to table src_sequencefile
[junit] POSTHOOK: Output: defa...@src_sequencefile
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/complex.seq
[junit] Loading data to table src_thrift
[junit] POSTHOOK: Output: defa...@src_thrift
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/json.txt
[junit] Loading data to table src_json
[junit] POSTHOOK: Output: defa...@src_json
[junit] OK
[junit] diff 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/build/ql/test/logs/negative/unknown_table1.q.out
 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/ql/src/test/results/compiler/errors/unknown_table1.q.out
[junit] Done query: unknown_table1.q
[junit] Begin query: unknown_table2.q
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=11
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-08, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-08/hr=12
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=11)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=11
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table srcpart partition (ds=2008-04-09, hr=12)
[junit] POSTHOOK: Output: defa...@srcpart@ds=2008-04-09/hr=12
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket0.txt
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket1.txt
[junit] Loading data to table srcbucket
[junit] POSTHOOK: Output: defa...@srcbucket
[junit] OK
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket20.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket21.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket22.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/srcbucket23.txt
[junit] Loading data to table srcbucket2
[junit] POSTHOOK: Output: defa...@srcbucket2
[junit] OK
[junit] Copying data from 
https://hudson.apache.org/hudson/job/Hive-trunk-h0.20/ws/hive/data/files/kv1.txt
[junit] Loading data to table src
[junit] POSTHOOK: Output: defa...@src
[junit] OK
[junit] Copying data from 

[jira] Work started: (HIVE-1641) add map joined table to distributed cache

2010-09-23 Thread Liyin Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-1641 started by Liyin Tang.

 add map joined table to distributed cache
 -

 Key: HIVE-1641
 URL: https://issues.apache.org/jira/browse/HIVE-1641
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Liyin Tang
 Fix For: 0.7.0


 Currently, the mappers directly read the map-joined table from HDFS, which 
 makes it difficult to scale. We end up getting lots of timeouts once the 
 number of mappers is beyond a few thousand, due to the concurrent mappers.
 It would be a good idea to put the mapped file into the distributed cache and 
 read from there instead.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HIVE-1659) parse_url_tuple: a UDTF version of parse_url

2010-09-23 Thread Xing Jin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xing Jin updated HIVE-1659:
---

   Status: Patch Available  (was: Open)
Affects Version/s: 0.5.0

Add new function parse_url_tuple for HIVE.

 parse_url_tuple:  a UDTF version of parse_url
 -

 Key: HIVE-1659
 URL: https://issues.apache.org/jira/browse/HIVE-1659
 Project: Hadoop Hive
  Issue Type: New Feature
Affects Versions: 0.5.0
Reporter: Ning Zhang

 The UDF parse_url takes a URL, parses it, and extracts QUERY/PATH etc. from 
 it. However, it can only extract an atomic value from the URL. If we want to 
 extract multiple pieces of information, we need to call the function many 
 times. It is desirable to parse the URL once, extract all the needed 
 information, and return a tuple in a UDTF.
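
 A hedged sketch of the difference (the table and column names are made up, and
 the exact UDTF signature may differ from what the final patch implements):

   -- Today: one parse_url call per extracted part.
   SELECT parse_url(url, 'HOST'), parse_url(url, 'PATH'), parse_url(url, 'QUERY', 'id')
   FROM web_logs;

   -- With a UDTF, the URL is parsed once and all parts come back as a tuple.
   SELECT t.host, t.path, t.query_id
   FROM web_logs
   LATERAL VIEW parse_url_tuple(url, 'HOST', 'PATH', 'QUERY:id') t AS host, path, query_id;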

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1641) add map joined table to distributed cache

2010-09-23 Thread Liyin Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914332#action_12914332
 ] 

Liyin Tang commented on HIVE-1641:
--

Right now, the local work is only for processing small tables for a map join 
operation. Also, one MapredTask can have at most one map join operation, 
because if one map join is followed by another map join, they are split into 
two tasks. So one MapredTask has at most one piece of local work to do.

One feasible solution is to create a new type of task, MapredLocalTask, which 
performs MapredLocalWork (the local work). If a MapredTask has local work to 
do, we create a new MapredLocalTask for that work, make the current MapredTask 
depend on the new task, and make the new task depend on the parent tasks of 
the current task.

In this new MapredLocalTask, the local work is done only once, producing the 
mapped file (a JDBM file). The next step is to put the generated mapped file 
into the distributed cache. All the mappers then read this file from the 
distributed cache and construct the in-memory hash table from it.

Any comments are very welcome :)
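
For context, a hedged example of the kind of query this affects (the table 
names are made up): every mapper currently reads the small side of the map 
join from HDFS, and the proposal is to ship the serialized small table through 
the distributed cache instead.

  SELECT /*+ MAPJOIN(d) */ f.user_id, d.country
  FROM fact_clicks f
  JOIN dim_users d ON (f.user_id = d.user_id);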


 add map joined table to distributed cache
 -

 Key: HIVE-1641
 URL: https://issues.apache.org/jira/browse/HIVE-1641
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain
Assignee: Liyin Tang
 Fix For: 0.7.0


 Currently, the mappers directly read the map-joined table from HDFS, which 
 makes it difficult to scale. We end up getting lots of timeouts once the 
 number of mappers is beyond a few thousand, due to the concurrent mappers.
 It would be a good idea to put the mapped file into the distributed cache and 
 read from there instead.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1378) Return value for map, array, and struct needs to return a string

2010-09-23 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914335#action_12914335
 ] 

Ning Zhang commented on HIVE-1378:
--

Steven, tests passed for Hadoop 0.20, but it failed to compile on Hadoop 0.17 
(ant clean package -Dhadoop.version=0.17.2.1). Can you take a look?

 Return value for map, array, and struct needs to return a string 
 -

 Key: HIVE-1378
 URL: https://issues.apache.org/jira/browse/HIVE-1378
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Drivers
Reporter: Jerome Boulon
Assignee: Steven Wong
 Fix For: 0.7.0

 Attachments: HIVE-1378.1.patch, HIVE-1378.2.patch, HIVE-1378.3.patch, 
 HIVE-1378.4.patch, HIVE-1378.5.patch, HIVE-1378.6.patch, HIVE-1378.patch


 In order to be able to select/display any data from JDBC Hive driver, return 
 value for map, array, and struct needs to return a string

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1659) parse_url_tuple: a UDTF version of parse_url

2010-09-23 Thread Ning Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914337#action_12914337
 ] 

Ning Zhang commented on HIVE-1659:
--

Xing, the patch was not attached. Can you use the "Attach file" link in the 
left pane?


 parse_url_tuple:  a UDTF version of parse_url
 -

 Key: HIVE-1659
 URL: https://issues.apache.org/jira/browse/HIVE-1659
 Project: Hadoop Hive
  Issue Type: New Feature
Affects Versions: 0.5.0
Reporter: Ning Zhang

 The UDF parse_url takes a URL, parses it, and extracts QUERY/PATH etc. from 
 it. However, it can only extract an atomic value from the URL. If we want to 
 extract multiple pieces of information, we need to call the function many 
 times. It is desirable to parse the URL once, extract all the needed 
 information, and return a tuple in a UDTF.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HIVE-1378) Return value for map, array, and struct needs to return a string

2010-09-23 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12914341#action_12914341
 ] 

John Sichi commented on HIVE-1378:
--

We're supposed to drop support for pre-0.20 Hadoop versions anyway...maybe now 
is a good time?


 Return value for map, array, and struct needs to return a string 
 -

 Key: HIVE-1378
 URL: https://issues.apache.org/jira/browse/HIVE-1378
 Project: Hadoop Hive
  Issue Type: Improvement
  Components: Drivers
Reporter: Jerome Boulon
Assignee: Steven Wong
 Fix For: 0.7.0

 Attachments: HIVE-1378.1.patch, HIVE-1378.2.patch, HIVE-1378.3.patch, 
 HIVE-1378.4.patch, HIVE-1378.5.patch, HIVE-1378.6.patch, HIVE-1378.patch


 In order to be able to select/display any data from JDBC Hive driver, return 
 value for map, array, and struct needs to return a string

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.