[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stamatis Zampetakis updated HIVE-18265: --- Fix Version/s: (was: 3.2.0) I cleared the fixVersion field since this ticket is not resolved. Please review this ticket and if the fix is already committed to a specific version please set the version accordingly and mark the ticket as RESOLVED. According to the JIRA guidelines (https://cwiki.apache.org/confluence/display/Hive/HowToContribute) the fixVersion should be set only when the issue is resolved/closed. > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 1.2.1, 3.1.0 >Reporter: Hui Huang >Assignee: Hui Huang >Priority: Major > Attachments: HIVE-18265.1.patch, HIVE-18265.2.patch, HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-18265: --- Fix Version/s: (was: 3.1.0) 3.2.0 Deferring this to 3.2.0 since the branch for 3.1.0 has been cut off. > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 1.2.1, 3.1.0 >Reporter: Hui Huang >Assignee: Hui Huang >Priority: Major > Fix For: 3.2.0 > > Attachments: HIVE-18265.1.patch, HIVE-18265.2.patch, HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Huang updated HIVE-18265: - Status: Open (was: Patch Available) > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 1.2.1, 3.1.0 >Reporter: Hui Huang >Assignee: Hui Huang >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-18265.1.patch, HIVE-18265.2.patch, HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Huang updated HIVE-18265: - Status: Patch Available (was: Open) > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 1.2.1, 3.1.0 >Reporter: Hui Huang >Assignee: Hui Huang >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-18265.1.patch, HIVE-18265.2.patch, HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Huang updated HIVE-18265: - Attachment: HIVE-18265.2.patch > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 1.2.1, 3.1.0 >Reporter: Hui Huang >Assignee: Hui Huang >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-18265.1.patch, HIVE-18265.2.patch, HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Huang updated HIVE-18265: - Target Version/s: 3.1.0 (was: 3.0.0) > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 1.2.1, 3.1.0 >Reporter: Hui Huang >Assignee: Hui Huang >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-18265.1.patch, HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Huang updated HIVE-18265: - Affects Version/s: (was: 3.0.0) 3.1.0 > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 1.2.1, 3.1.0 >Reporter: Hui Huang >Assignee: Hui Huang >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-18265.1.patch, HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vineet Garg updated HIVE-18265: --- Fix Version/s: (was: 3.0.0) 3.1.0 Deferring this to 3.1.0 since the branch for 3.0.0 has been cut off. Please update the JIRA if you would like to get your patch in 3.0.0. > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 1.2.1, 3.0.0 >Reporter: Hui Huang >Assignee: Hui Huang >Priority: Major > Fix For: 3.1.0 > > Attachments: HIVE-18265.1.patch, HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Huang updated HIVE-18265: - Affects Version/s: 1.2.1 > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 1.2.1, 3.0.0 >Reporter: Hui Huang >Assignee: Hui Huang > Fix For: 3.0.0 > > Attachments: HIVE-18265.1.patch, HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Huang updated HIVE-18265: - Status: Patch Available (was: Open) > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 3.0.0 >Reporter: Hui Huang >Assignee: Hui Huang > Fix For: 3.0.0 > > Attachments: HIVE-18265.1.patch, HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Huang updated HIVE-18265: - Status: In Progress (was: Patch Available) > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 3.0.0 >Reporter: Hui Huang >Assignee: Hui Huang > Fix For: 3.0.0 > > Attachments: HIVE-18265.1.patch, HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Huang updated HIVE-18265: - Status: Patch Available (was: Open) Add some test cases. > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 3.0.0 >Reporter: Hui Huang >Assignee: Hui Huang > Fix For: 3.0.0 > > Attachments: HIVE-18265.1.patch, HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Huang updated HIVE-18265: - Attachment: HIVE-18265.1.patch > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 3.0.0 >Reporter: Hui Huang >Assignee: Hui Huang > Fix For: 3.0.0 > > Attachments: HIVE-18265.1.patch, HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Huang updated HIVE-18265: - Status: Open (was: Patch Available) > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 3.0.0 >Reporter: Hui Huang >Assignee: Hui Huang > Fix For: 3.0.0 > > Attachments: HIVE-18265.1.patch, HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Huang updated HIVE-18265: - Status: Patch Available (was: Open) Hi, all~ I've tried the following methods: 1. Modify the HiveLexer.g and HiveParser.g, but failed 2. Replace tab character with a space character 3. Check the comment during semantic analyzing and throw semantic exception At last, I took the third one. > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 3.0.0 >Reporter: Hui Huang >Assignee: Hui Huang > Fix For: 3.0.0 > > Attachments: HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-18265) desc formatted/extended or show create table can not fully display the result when field or table comment contains tab character
[ https://issues.apache.org/jira/browse/HIVE-18265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hui Huang updated HIVE-18265: - Attachment: HIVE-18265.patch > desc formatted/extended or show create table can not fully display the result > when field or table comment contains tab character > > > Key: HIVE-18265 > URL: https://issues.apache.org/jira/browse/HIVE-18265 > Project: Hive > Issue Type: Bug > Components: CLI >Affects Versions: 3.0.0 >Reporter: Hui Huang >Assignee: Hui Huang > Fix For: 3.0.0 > > Attachments: HIVE-18265.patch > > > Here are some examples: > create table test_comment (id1 string comment 'full_\tname1', id2 string > comment 'full_\tname2', id3 string comment 'full_\tname3') stored as textfile; > When execute `show create table test_comment`, we can see the following > content in the console, > {quote} > createtab_stmt > CREATE TABLE `test_comment`( > `id1` string COMMENT 'full_ > `id2` string COMMENT 'full_ > `id3` string COMMENT 'full_ > ROW FORMAT SERDE > 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' > STORED AS INPUTFORMAT > 'org.apache.hadoop.mapred.TextInputFormat' > OUTPUTFORMAT > 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' > LOCATION > 'hdfs://xxx/user/huanghui/warehouse/huanghuitest.db/test_comment' > TBLPROPERTIES ( > 'transient_lastDdlTime'='1513095570') > {quote} > And the output of `desc formatted table ` is a little similar, > {quote} > col_name data_type comment > \# col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > \# Detailed Table Information > (ignore)... > {quote} > When execute `desc extended test_comment`, the problem is more obvious, > {quote} > col_name data_type comment > id1 string full_ > id2 string full_ > id3 string full_ > Detailed Table InformationTable(tableName:test_comment, > dbName:huanghuitest, owner:huanghui, createTime:1513095570, lastAccessTime:0, > retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:id1, type:string, > comment:full_name1), FieldSchema(name:id2, type:string, comment:full_ > {quote} > *the rest of the content is lost*. > The content is not really lost, it's just can not display normal. Because > hive store the result in LazyStruct, and LazyStruct use '\t' as field > separator: > {code:java} > // LazyStruct.java#parse() > // Go through all bytes in the byte[] > while (fieldByteEnd <= structByteEnd) { > if (fieldByteEnd == structByteEnd || bytes[fieldByteEnd] == separator) { > // Reached the end of a field? > if (lastColumnTakesRest && fieldId == fields.length - 1) { > fieldByteEnd = structByteEnd; > } > startPosition[fieldId] = fieldByteBegin; > fieldId++; > if (fieldId == fields.length || fieldByteEnd == structByteEnd) { > // All fields have been parsed, or bytes have been parsed. > // We need to set the startPosition of fields.length to ensure we > // can use the same formula to calculate the length of each field. > // For missing fields, their starting positions will all be the > same, > // which will make their lengths to be -1 and uncheckedGetField will > // return these fields as NULLs. > for (int i = fieldId; i <= fields.length; i++) { > startPosition[i] = fieldByteEnd + 1; > } > break; > } > fieldByteBegin = fieldByteEnd + 1; > fieldByteEnd++; > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)