[ https://issues.apache.org/jira/browse/SQOOP-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220290#comment-14220290 ]
Veena Basavaraj edited comment on SQOOP-1771 at 11/21/14 4:59 PM:
------------------------------------------------------------------
PostgreSQL:
http://www.postgresql.org/docs/9.1/static/arrays.html
To write an array value as a literal constant, enclose the element values
within curly braces and separate them by commas. (If you know C, this is not
unlike the C syntax for initializing structures.) You can put double quotes
around any element value, and must do so if it contains commas or curly braces.
(More details appear below.) Thus, the general format of an array constant is
the following:
{code}
'{ val1 delim val2 delim ... }'
{code}
where delim is the delimiter character for the type, as recorded in its pg_type
entry. Among the standard data types provided in the PostgreSQL distribution,
all use a comma (,), except for type box which uses a semicolon (;). Each val
is either a constant of the array element type, or a subarray. An example of an
array constant is:
{code}
'{{1,2,3},{4,5,6},{7,8,9}}'
{code}
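As a rough illustration of the quoted format, here is a minimal Python sketch that renders a nested list as the body of a PostgreSQL array literal (without the surrounding single quotes). The function name `to_pg_array_literal` is hypothetical and not part of Sqoop or any PostgreSQL driver. The quoting rules beyond commas and curly braces (whitespace, double quotes, backslashes, empty strings, the word NULL) follow my reading of the same PostgreSQL documentation page and should be treated as assumptions to verify.

```python
def to_pg_array_literal(value, delim=","):
    """Render a (possibly nested) Python list as a PostgreSQL array literal body.

    Hypothetical helper for illustration only; `delim` is the type's
    delimiter (a comma for every standard type except box).
    """
    if isinstance(value, list):
        # A list becomes a brace-enclosed, delimiter-separated group;
        # nested lists become subarrays.
        return "{" + delim.join(to_pg_array_literal(v, delim) for v in value) + "}"
    if value is None:
        return "NULL"
    text = str(value)
    # Quote elements that would otherwise be ambiguous inside the literal.
    needs_quotes = (
        text == ""
        or text.upper() == "NULL"
        or any(ch in text for ch in '{}"\\' + delim)
        or any(ch.isspace() for ch in text)
    )
    if needs_quotes:
        # Inside double quotes, backslash-escape backslashes and double quotes.
        text = '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return text


print(to_pg_array_literal([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
# {{1,2,3},{4,5,6},{7,8,9}}
print(to_pg_array_literal(["a,b", "plain", "{x}"]))
# {"a,b",plain,"{x}"}
```

The sketch does not escape single quotes for embedding the result in an SQL statement; with a real driver, passing the value as a bound parameter would be the safer route.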
Hive's representation of data is pluggable: it depends on the SerDe
(serializer/deserializer) in use, so there is no single standard.
Avro is a common standard.
Details of the JSON SerDe are here:
http://blog.cloudera.com/blog/2012/12/how-to-use-a-serde-in-apache-hive/
Some references:
http://www.slideshare.net/zshao/hive-serde-and-lazyserde
https://github.com/rcongiu/Hive-JSON-Serde
Hive's built-in LazyBinarySerDe, which is by no means easy to understand, is here :)
(note that the default SerDe for text tables is LazySimpleSerDe):
https://github.com/apache/hive/blob/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java
https://github.com/apache/hive/blob/trunk/serde/src/test/org/apache/hadoop/hive/serde2/lazybinary/TestLazyBinarySerDe.java
I also asked Brock for his thoughts on Hive:
{quote}
In any case the default file type is text, newline separated, fields by \001,
collection items (arrays) by \002 and map key values by \003
FIELDS TERMINATED BY '\001'
COLLECTION ITEMS TERMINATED BY '\002'
MAP KEYS TERMINATED BY '\003'
{quote}
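To make the default delimiters concrete, here is a minimal Python sketch that encodes one row containing a scalar, an array, and a map the way the quote describes. The names `encode_field` and `encode_row` are hypothetical. The sketch assumes flat (non-nested) collections and that map entries are separated from each other by the collection-item delimiter, and it ignores escaping and NULL handling.

```python
FIELD_SEP = "\x01"    # FIELDS TERMINATED BY '\001'
ITEM_SEP = "\x02"     # COLLECTION ITEMS TERMINATED BY '\002'
MAP_KEY_SEP = "\x03"  # MAP KEYS TERMINATED BY '\003'


def encode_field(value):
    """Encode one column value; only flat arrays and maps are handled."""
    if isinstance(value, dict):
        # Each entry is key \003 value; entries are joined by \002.
        return ITEM_SEP.join(f"{k}{MAP_KEY_SEP}{v}" for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return ITEM_SEP.join(str(v) for v in value)
    return str(value)


def encode_row(row):
    """Join the encoded columns of one row with the field delimiter."""
    return FIELD_SEP.join(encode_field(v) for v in row)


line = encode_row([42, ["a", "b", "c"], {"k1": "v1", "k2": "v2"}])
print(repr(line))
# '42\x01a\x02b\x02c\x01k1\x03v1\x02k2\x03v2'
```

Rows produced this way would then be written one per line, matching the newline-separated text format mentioned in the quote.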
> Investigation FORMAT of the Array/NestedArray/ Set/ Map in Postgres and HIVE.
> -----------------------------------------------------------------------------
>
> Key: SQOOP-1771
> URL: https://issues.apache.org/jira/browse/SQOOP-1771
> Project: Sqoop
> Issue Type: Sub-task
> Components: sqoop2-framework
> Reporter: Veena Basavaraj
> Fix For: 1.99.5
>
>
> Update this wiki page, which is missing details on the complex types:
> https://cwiki.apache.org/confluence/display/SQOOP/Sqoop2+Intermediate+representation#Sqoop2Intermediaterepresentation-Intermediateformatrepresentationproposal