[ 
https://issues.apache.org/jira/browse/SQOOP-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220290#comment-14220290
 ] 

Veena Basavaraj edited comment on SQOOP-1771 at 11/21/14 4:59 PM:
------------------------------------------------------------------

PostgreSQL:

http://www.postgresql.org/docs/9.1/static/arrays.html
To write an array value as a literal constant, enclose the element values 
within curly braces and separate them by commas. (If you know C, this is not 
unlike the C syntax for initializing structures.) You can put double quotes 
around any element value, and must do so if it contains commas or curly braces. 
(More details appear below.) Thus, the general format of an array constant is 
the following:
{code}
'{ val1 delim val2 delim ... }'
{code}
where delim is the delimiter character for the type, as recorded in its pg_type 
entry. Among the standard data types provided in the PostgreSQL distribution, 
all use a comma (,), except for type box which uses a semicolon (;). Each val 
is either a constant of the array element type, or a subarray. An example of an 
array constant is:
{code}
'{{1,2,3},{4,5,6},{7,8,9}}'
{code}
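
As a rough illustration of the literal format quoted above, here is a minimal Python sketch that renders a nested list as a PostgreSQL array constant. The function name to_pg_array_literal and its delim parameter are hypothetical, introduced only for this example; the quoting rule is a conservative reading of the PostgreSQL documentation (quote anything containing the delimiter, braces, quotes, backslashes, or whitespace, plus empty strings and the word NULL), not something Sqoop is known to implement.

```python
def to_pg_array_literal(values, delim=","):
    """Render a (possibly nested) list as a PostgreSQL array constant.

    Hypothetical helper for illustration only; not part of Sqoop or any driver.
    """
    def render(v):
        # A nested list or tuple becomes a brace-enclosed subarray.
        if isinstance(v, (list, tuple)):
            return "{" + delim.join(render(x) for x in v) + "}"
        # An unquoted NULL denotes a SQL NULL element.
        if v is None:
            return "NULL"
        s = str(v)
        # Conservative quoting: quote whenever the element could be misparsed.
        needs_quotes = (
            s == ""
            or s.upper() == "NULL"
            or any(c in s for c in (delim, "{", "}", '"', "\\"))
            or any(c.isspace() for c in s)
        )
        if needs_quotes:
            # Inside double quotes, backslashes and quotes are backslash-escaped.
            s = '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'
        return s

    return render(values)


if __name__ == "__main__":
    print(to_pg_array_literal([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))
    print(to_pg_array_literal(["a,b", "c"]))
```

For type box, the same sketch could be called with delim=";", following the delimiter rule quoted above.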


Hive's representation of data is pluggable: it depends on which SerDe 
(serializer/deserializer) the table uses, so there is no single standard.
Avro is a common choice.

Details of the JSON SerDe are here:

http://blog.cloudera.com/blog/2012/12/how-to-use-a-serde-in-apache-hive/
Some references:

http://www.slideshare.net/zshao/hive-serde-and-lazyserde

https://github.com/rcongiu/Hive-JSON-Serde

The default SerDe, which is in no way easy to understand, is here :)

https://github.com/apache/hive/blob/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinarySerDe.java

https://github.com/apache/hive/blob/trunk/serde/src/test/org/apache/hadoop/hive/serde2/lazybinary/TestLazyBinarySerDe.java

I also asked Brock for his thoughts on Hive:

{quote}

 In any case the default file type is text, newline separated, fields by \001, 
collection items (arrays) by \002 and map key values by \003

   FIELDS TERMINATED BY '\001'
   COLLECTION ITEMS TERMINATED BY '\002'
   MAP KEYS TERMINATED BY '\003'

{quote}
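
To make the delimiters in the quote concrete, here is a minimal Python sketch that encodes one row in that default text layout. The helper names encode_field and encode_row are hypothetical, and the sketch assumes only top-level arrays and maps; it does not attempt nested collections, NULL handling, or escaping.

```python
# Default Hive text delimiters as described in the quote above.
FIELD = "\x01"    # FIELDS TERMINATED BY '\001'
ITEM = "\x02"     # COLLECTION ITEMS TERMINATED BY '\002'
MAP_KEY = "\x03"  # MAP KEYS TERMINATED BY '\003'


def encode_field(value):
    """Encode a scalar, a flat list (array), or a flat dict (map) as text."""
    if isinstance(value, dict):
        # Map entries are separated like collection items; each key is
        # separated from its value by the map-key delimiter.
        return ITEM.join(f"{k}{MAP_KEY}{v}" for k, v in value.items())
    if isinstance(value, (list, tuple)):
        return ITEM.join(str(v) for v in value)
    return str(value)


def encode_row(fields):
    """Join the encoded fields of one row; rows are separated by newlines."""
    return FIELD.join(encode_field(f) for f in fields)


if __name__ == "__main__":
    row = [1, ["a", "b"], {"k": "v"}]
    print(repr(encode_row(row)))
```

A file in this layout would then be the encoded rows joined by newline characters.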




> Investigation FORMAT of the Array/NestedArray/ Set/ Map in Postgres and HIVE.
> -----------------------------------------------------------------------------
>
>                 Key: SQOOP-1771
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1771
>             Project: Sqoop
>          Issue Type: Sub-task
>          Components: sqoop2-framework
>            Reporter: Veena Basavaraj
>             Fix For: 1.99.5
>
>
> update this wiki, which is missing details on the complex types
> https://cwiki.apache.org/confluence/display/SQOOP/Sqoop2+Intermediate+representation#Sqoop2Intermediaterepresentation-Intermediateformatrepresentationproposal



