[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables

2014-08-20 Thread Amareshwari Sriramadasu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amareshwari Sriramadasu updated HIVE-7629:
--

Assignee: Suma Shivaprasad

 Problem in SMB Joins between two Parquet tables
 ---

 Key: HIVE-7629
 URL: https://issues.apache.org/jira/browse/HIVE-7629
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Suma Shivaprasad
Assignee: Suma Shivaprasad
  Labels: Parquet
 Fix For: 0.14.0

 Attachments: HIVE-7629.1.patch, HIVE-7629.patch


 The issue is clearly seen when two bucketed and sorted parquet tables with 
 different number of columns are involved in the join . The following 
 exception is seen
 {noformat}
 Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
 at java.util.ArrayList.rangeCheck(ArrayList.java:635)
 at java.util.ArrayList.get(ArrayList.java:411)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66)
 at 
 org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables

2014-08-20 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7629:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thank you so much for your contribution! I have committed this to trunk!

 Problem in SMB Joins between two Parquet tables
 ---

 Key: HIVE-7629
 URL: https://issues.apache.org/jira/browse/HIVE-7629
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Suma Shivaprasad
Assignee: Suma Shivaprasad
  Labels: Parquet
 Fix For: 0.14.0

 Attachments: HIVE-7629.1.patch, HIVE-7629.patch


 The issue is clearly seen when two bucketed and sorted parquet tables with 
 different number of columns are involved in the join . The following 
 exception is seen
 {noformat}
 Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
 at java.util.ArrayList.rangeCheck(ArrayList.java:635)
 at java.util.ArrayList.get(ArrayList.java:411)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66)
 at 
 org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables

2014-08-12 Thread Suma Shivaprasad (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated HIVE-7629:
---

Attachment: HIVE-7629.1.patch

 Problem in SMB Joins between two Parquet tables
 ---

 Key: HIVE-7629
 URL: https://issues.apache.org/jira/browse/HIVE-7629
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Suma Shivaprasad
  Labels: Parquet
 Fix For: 0.14.0

 Attachments: HIVE-7629.1.patch, HIVE-7629.patch


 The issue is clearly seen when two bucketed and sorted parquet tables with 
 different number of columns are involved in the join . The following 
 exception is seen
 Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
 at java.util.ArrayList.rangeCheck(ArrayList.java:635)
 at java.util.ArrayList.get(ArrayList.java:411)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66)
 at 
 org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables

2014-08-12 Thread Brock Noland (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7629:
---

Description: 
The issue is clearly seen when two bucketed and sorted parquet tables with 
different number of columns are involved in the join . The following exception 
is seen

{noformat}
Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at 
org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101)
at 
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
at 
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79)
at 
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66)
at 
org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65)
{noformat}

  was:
The issue is clearly seen when two bucketed and sorted parquet tables with 
different number of columns are involved in the join . The following exception 
is seen

Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at 
org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101)
at 
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
at 
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79)
at 
org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66)
at 
org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
at 
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65)


 Problem in SMB Joins between two Parquet tables
 ---

 Key: HIVE-7629
 URL: https://issues.apache.org/jira/browse/HIVE-7629
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Suma Shivaprasad
  Labels: Parquet
 Fix For: 0.14.0

 Attachments: HIVE-7629.1.patch, HIVE-7629.patch


 The issue is clearly seen when two bucketed and sorted parquet tables with 
 different number of columns are involved in the join . The following 
 exception is seen
 {noformat}
 Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
 at java.util.ArrayList.rangeCheck(ArrayList.java:635)
 at java.util.ArrayList.get(ArrayList.java:411)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66)
 at 
 org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65)
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables

2014-08-07 Thread Suma Shivaprasad (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated HIVE-7629:
---

Attachment: parquet_smb_join.patch

 Problem in SMB Joins between two Parquet tables
 ---

 Key: HIVE-7629
 URL: https://issues.apache.org/jira/browse/HIVE-7629
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Suma Shivaprasad
 Attachments: HIVE-7629.patch


 The issue is clearly seen when two bucketed and sorted parquet tables with 
 different number of columns are involved in the join . The following 
 exception is seen
 Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
 at java.util.ArrayList.rangeCheck(ArrayList.java:635)
 at java.util.ArrayList.get(ArrayList.java:411)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66)
 at 
 org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables

2014-08-07 Thread Suma Shivaprasad (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated HIVE-7629:
---

Attachment: HIVE-7629.patch

 Problem in SMB Joins between two Parquet tables
 ---

 Key: HIVE-7629
 URL: https://issues.apache.org/jira/browse/HIVE-7629
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Suma Shivaprasad
 Attachments: HIVE-7629.patch


 The issue is clearly seen when two bucketed and sorted parquet tables with 
 different number of columns are involved in the join . The following 
 exception is seen
 Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
 at java.util.ArrayList.rangeCheck(ArrayList.java:635)
 at java.util.ArrayList.get(ArrayList.java:411)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66)
 at 
 org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables

2014-08-07 Thread Suma Shivaprasad (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated HIVE-7629:
---

Fix Version/s: 0.14.0
   Status: Patch Available  (was: Open)

 Problem in SMB Joins between two Parquet tables
 ---

 Key: HIVE-7629
 URL: https://issues.apache.org/jira/browse/HIVE-7629
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Suma Shivaprasad
 Fix For: 0.14.0

 Attachments: HIVE-7629.patch


 The issue is clearly seen when two bucketed and sorted parquet tables with 
 different number of columns are involved in the join . The following 
 exception is seen
 Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
 at java.util.ArrayList.rangeCheck(ArrayList.java:635)
 at java.util.ArrayList.get(ArrayList.java:411)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66)
 at 
 org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables

2014-08-07 Thread Suma Shivaprasad (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated HIVE-7629:
---

Labels: Parquet  (was: )

 Problem in SMB Joins between two Parquet tables
 ---

 Key: HIVE-7629
 URL: https://issues.apache.org/jira/browse/HIVE-7629
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Suma Shivaprasad
  Labels: Parquet
 Fix For: 0.14.0

 Attachments: HIVE-7629.patch


 The issue is clearly seen when two bucketed and sorted parquet tables with 
 different number of columns are involved in the join . The following 
 exception is seen
 Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
 at java.util.ArrayList.rangeCheck(ArrayList.java:635)
 at java.util.ArrayList.get(ArrayList.java:411)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66)
 at 
 org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables

2014-08-07 Thread Suma Shivaprasad (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated HIVE-7629:
---

Attachment: (was: parquet_smb_join.patch)

 Problem in SMB Joins between two Parquet tables
 ---

 Key: HIVE-7629
 URL: https://issues.apache.org/jira/browse/HIVE-7629
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Suma Shivaprasad
 Attachments: HIVE-7629.patch


 The issue is clearly seen when two bucketed and sorted parquet tables with 
 different number of columns are involved in the join . The following 
 exception is seen
 Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
 at java.util.ArrayList.rangeCheck(ArrayList.java:635)
 at java.util.ArrayList.get(ArrayList.java:411)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66)
 at 
 org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65)



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (HIVE-7629) Problem in SMB Joins between two Parquet tables

2014-08-06 Thread Suma Shivaprasad (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suma Shivaprasad updated HIVE-7629:
---

Affects Version/s: (was: 0.13.1)
   0.13.0

 Problem in SMB Joins between two Parquet tables
 ---

 Key: HIVE-7629
 URL: https://issues.apache.org/jira/browse/HIVE-7629
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.13.0
Reporter: Suma Shivaprasad

 The issue is clearly seen when two bucketed and sorted parquet tables with 
 different number of columns are involved in the join . The following 
 exception is seen
 Caused by: java.lang.IndexOutOfBoundsException: Index: 2, Size: 2
 at java.util.ArrayList.rangeCheck(ArrayList.java:635)
 at java.util.ArrayList.get(ArrayList.java:411)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:101)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:204)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:79)
 at 
 org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.init(ParquetRecordReaderWrapper.java:66)
 at 
 org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
 at 
 org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.init(CombineHiveRecordReader.java:65)



--
This message was sent by Atlassian JIRA
(v6.2#6252)