[jira] [Updated] (HIVE-4969) HCatalog HBaseHCatStorageHandler is not returning all the data

2013-10-06 Thread Thejas M Nair (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thejas M Nair updated HIVE-4969:


Fix Version/s: (was: 0.11.1)
   (was: 0.12.0)

Preparing for 0.12 release. Removing fix version of 0.12 for those that are not 
in 0.12 branch.


 HCatalog HBaseHCatStorageHandler is not returning all the data
 --

 Key: HIVE-4969
 URL: https://issues.apache.org/jira/browse/HIVE-4969
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Venki Korukanti
Priority: Critical
 Attachments: HIVE-4969-1.patch


 Repro steps:
 1) Create an HCatalog table mapped to an HBase table.
 hcat -e "CREATE TABLE studentHCat(rownum int, name string, age int, gpa float)
  STORED BY 'org.apache.hcatalog.hbase.HBaseHCatStorageHandler'
  TBLPROPERTIES('hbase.table.name' = 'studentHBase',
                'hbase.columns.mapping' = ':key,onecf:name,twocf:age,threecf:gpa');"
 2) Load the following data from Pig.
 cat student_data
 1^Asarah laertes^A23^A2.40
 2^Atom allen^A72^A1.57
 3^Abob ovid^A61^A2.67
 4^Aethan nixon^A38^A2.15
 5^Acalvin robinson^A28^A2.53
 6^Airene ovid^A65^A2.56
 7^Ayuri garcia^A36^A1.65
 8^Acalvin nixon^A41^A1.04
 9^Ajessica davidson^A48^A2.11
 10^Akatie king^A39^A1.05
 grunt> A = LOAD 'student_data' AS (rownum:int,name:chararray,age:int,gpa:float);
 grunt> STORE A INTO 'studentHCat' USING org.apache.hcatalog.pig.HCatStorer();
 3) From the HBase shell, scan the studentHBase table.
 hbase(main):026:0> scan 'studentPig', {LIMIT => 5}
 4) From Pig, read the table back through HCatLoader and store it to HDFS.
 grunt> A = LOAD 'studentHCat' USING org.apache.hcatalog.pig.HCatLoader();
 grunt> STORE A INTO '/user/root/studentPig';
 5) Verify the output written to studentPig.
 hadoop fs -cat /user/root/studentPig/part-r-0
 1  23
 2  72
 3  61
 4  38
 5  28
 6  65
 7  36
 8  41
 9  48
 10 39
 The output contains only two of the four fields (rownum and age); name and
 gpa are missing.
 Problem:
 While reading data from the HBase table, HbaseSnapshotRecordReader receives
 each row as a Result (org.apache.hadoop.hbase.client.Result) object and
 processes the KeyValue fields in it. After processing, it creates a new
 Result object from the processed KeyValue array. The problem is that this
 KeyValue array is not sorted, while Result expects its input KeyValue array
 to be sorted. Result.getValue() does a binary search on the array, so on an
 unsorted array it returns no value for some of the fields.
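
 The failure mode described above can be sketched without HBase. The
 following is a minimal illustration, not the code from HIVE-4969-1.patch:
 plain strings stand in for KeyValue entries, and the class name
 SortedLookupSketch and the method find() are hypothetical. It assumes only
 that the lookup is a binary search, as the report states.

```java
import java.util.Arrays;

class SortedLookupSketch {
    // Hypothetical stand-in for a binary-search lookup such as the one the
    // report attributes to Result.getValue(): true if the column is located.
    static boolean find(String[] columns, String wanted) {
        return Arrays.binarySearch(columns, wanted) >= 0;
    }

    public static void main(String[] args) {
        // Column names taken from hbase.columns.mapping in the repro steps,
        // deliberately left out of sorted order.
        String[] unsorted = {"twocf:age", "threecf:gpa", "onecf:name"};
        for (String column : unsorted) {
            // Binary search on unsorted input may miss entries that are present.
            System.out.println("unsorted " + column + " found=" + find(unsorted, column));
        }

        // Sorting a copy first restores the precondition of binary search.
        String[] sorted = unsorted.clone();
        Arrays.sort(sorted);
        for (String column : unsorted) {
            System.out.println("sorted   " + column + " found=" + find(sorted, column));
        }
    }
}
```

 Every lookup succeeds once the array is sorted, while the unsorted array can
 report present columns as missing. In the HBase client API the analogous
 step would presumably be to sort the processed KeyValue array (for example
 with KeyValue.COMPARATOR) before building the new Result; whether the
 attached patch does exactly this is not shown in this thread.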



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-4969) HCatalog HBaseHCatStorageHandler is not returning all the data

2013-09-03 Thread Sushanth Sowmyan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sushanth Sowmyan updated HIVE-4969:
---

Status: Open  (was: Patch Available)

Canceling patch as HIVE-4869 package cleanup will invalidate all current 
HCatalog patches and will require patch regeneration.

 HCatalog HBaseHCatStorageHandler is not returning all the data
 --

 Key: HIVE-4969
 URL: https://issues.apache.org/jira/browse/HIVE-4969
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Venki Korukanti
Priority: Critical
 Fix For: 0.11.1, 0.12.0

 Attachments: HIVE-4969-1.patch



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4969) HCatalog HBaseHCatStorageHandler is not returning all the data

2013-07-31 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti updated HIVE-4969:
--

Description: 
Repro steps:
1) Create an HCatalog table mapped to HBase table.

hcat -e "CREATE TABLE studentHCat(rownum int, name string, age int, gpa float)
 STORED BY 'org.apache.hcatalog.hbase.HBaseHCatStorageHandler'
 TBLPROPERTIES('hbase.table.name' = 'studentHBase',
               'hbase.columns.mapping' = ':key,onecf:name,twocf:age,threecf:gpa');"


2) Load the following data from Pig.

cat student_data
1^Asarah laertes^A23^A2.40
2^Atom allen^A72^A1.57
3^Abob ovid^A61^A2.67
4^Aethan nixon^A38^A2.15
5^Acalvin robinson^A28^A2.53
6^Airene ovid^A65^A2.56
7^Ayuri garcia^A36^A1.65
8^Acalvin nixon^A41^A1.04
9^Ajessica davidson^A48^A2.11
10^Akatie king^A39^A1.05


grunt> A = LOAD 'student_data' AS (rownum:int,name:chararray,age:int,gpa:float);

grunt> STORE A INTO 'studentHCat' USING org.apache.hcatalog.pig.HCatStorer();

3) Now from HBase do a scan on the studentHBase table
hbase(main):026:0> scan 'studentPig', {LIMIT => 5}

4) From pig access the data in table
grunt> A = LOAD 'studentHCat' USING org.apache.hcatalog.pig.HCatLoader();
grunt> STORE A INTO '/user/root/studentPig';


5) Verify the output written in StudentPig
hadoop fs -cat /user/root/studentPig/part-r-0
1  23
2  72
3  61
4  38
5  28
6  65
7  36
8  41
9  48
10 39

The data returned has only two fields (rownum and age).


Problem:
While reading the data from HBase table, HbaseSnapshotRecordReader gets data 
row in Result (org.apache.hadoop.hbase.client.Result) object and processes the 
KeyValue fields in it. After processing, it creates another Result object out 
of the processed KeyValue array. Problem here is KeyValue array is not sorted. 
Result object expects the input KeyValue array to have sorted elements. When we 
call the Result.getValue() it returns no value for some of the fields as it 
does a binary search on un-ordered array.

  was:

Repro steps:
1) Create an HCatalog table mapped to HBase table.

hcat -e "CREATE TABLE studentHCat(rownum int, name string, age int, gpa float)
 STORED BY 'org.apache.hcatalog.hbase.HBaseHCatStorageHandler'
 TBLPROPERTIES('hbase.table.name' = 'studentHBase',
               'hbase.columns.mapping' = ':key,onecf:name,twocf:age,threecf:gpa');"


2) Load the following data from Pig.

cat student_data
1^Asarah laertes^A23^A2.40
2^Atom allen^A72^A1.57
3^Abob ovid^A61^A2.67
4^Aethan nixon^A38^A2.15
5^Acalvin robinson^A28^A2.53
6^Airene ovid^A65^A2.56
7^Ayuri garcia^A36^A1.65
8^Acalvin nixon^A41^A1.04
9^Ajessica davidson^A48^A2.11
10^Akatie king^A39^A1.05


grunt> A = LOAD 'student_data' AS (rownum:int,name:chararray,age:int,gpa:float);

grunt> STORE A INTO 'studentHCat' USING org.apache.hcatalog.pig.HCatStorer();

3) Now from HBase do a scan on the studentHBase table
hbase(main):026:0> scan 'studentPig', {LIMIT => 5}

4) From pig access the data in table
grunt> A = LOAD 'studentHCat' USING org.apache.hcatalog.pig.HCatLoader();
grunt> STORE A INTO '/user/root/studentPig';


5) Verify the output written in StudentPig
hadoop fs -cat /user/root/studentPig/part-r-0
1  23
2  72
3  61
4  38
5  28
6  65
7  36
8  41
9  48
10 39

The data returned only two fields (rownum and age).


Problem:
While reading the data from HBase table, HbaseSnapshotRecordReader gets data 
row in Result (org.apache.hadoop.hbase.client.Result) object and processes the 
KeyValue fields in it. After processing it creates another Result object out of 
the processed KeyValue array. Problem here is KeyValue array is not sorted. 
Result object expects the input KeyValue array to have sorted elements. When we 
call the Result.getValue() it returns no value for some of the fields as it 
does a binary search on unordered array.

 HCatalog HBaseHCatStorageHandler is not returning all the data
 --

 Key: HIVE-4969
 URL: https://issues.apache.org/jira/browse/HIVE-4969
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Venki Korukanti
Priority: Critical


[jira] [Updated] (HIVE-4969) HCatalog HBaseHCatStorageHandler is not returning all the data

2013-07-31 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti updated HIVE-4969:
--

Attachment: HIVE-4969-1.patch

 HCatalog HBaseHCatStorageHandler is not returning all the data
 --

 Key: HIVE-4969
 URL: https://issues.apache.org/jira/browse/HIVE-4969
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Venki Korukanti
Priority: Critical
 Fix For: 0.11.1, 0.12.0

 Attachments: HIVE-4969-1.patch




[jira] [Updated] (HIVE-4969) HCatalog HBaseHCatStorageHandler is not returning all the data

2013-07-31 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti updated HIVE-4969:
--

Attachment: (was: HIVE-4969-1.patch)

 HCatalog HBaseHCatStorageHandler is not returning all the data
 --

 Key: HIVE-4969
 URL: https://issues.apache.org/jira/browse/HIVE-4969
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Venki Korukanti
Priority: Critical
 Fix For: 0.11.1, 0.12.0




[jira] [Updated] (HIVE-4969) HCatalog HBaseHCatStorageHandler is not returning all the data

2013-07-31 Thread Venki Korukanti (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Venki Korukanti updated HIVE-4969:
--

Attachment: HIVE-4969-1.patch

 HCatalog HBaseHCatStorageHandler is not returning all the data
 --

 Key: HIVE-4969
 URL: https://issues.apache.org/jira/browse/HIVE-4969
 Project: Hive
  Issue Type: Bug
  Components: HCatalog
Affects Versions: 0.11.0
Reporter: Venki Korukanti
Priority: Critical
 Fix For: 0.11.1, 0.12.0

 Attachments: HIVE-4969-1.patch

