[jira] [Comment Edited] (PHOENIX-918) Support importing directly from ORC formatted HDFS data

James Taylor (JIRA) Wed, 09 Apr 2014 11:55:27 -0700

    [ 
https://issues.apache.org/jira/browse/PHOENIX-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964541#comment-13964541
 ]


James Taylor edited comment on PHOENIX-918 at 4/9/14 6:53 PM:
--------------------------------------------------------------

I believe there's an impedance mismatch between HCatalog and Phoenix metadata, 
but perhaps this has evolved since I last checked? Phoenix needs to be able to 
do the following:
- Define a composite primary key (this is key to our performance and to how we 
support secondary indexes)
- Define not null constraints on columns in the PK
- Define the full range of SQL types that we support
- Pass through key/value properties for various Phoenix-specific and HBase 
HTableDescriptor/HColumnDescriptor options
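
As an illustration of the four requirements above, here is a minimal sketch 
that renders a Phoenix-style CREATE TABLE statement from a simple metadata 
description. The `Column` class and the `build_create_table` and 
`format_value` helpers are hypothetical names made up for this example and are 
not part of Phoenix or HCatalog. The table name, column names, and the 
SALT_BUCKETS and VERSIONS options are example values only.

```python
from dataclasses import dataclass


@dataclass
class Column:
    """Hypothetical column description; not a Phoenix or HCatalog class."""
    name: str
    sql_type: str          # any SQL type string, e.g. "VARCHAR" or "DECIMAL(10,2)"
    in_pk: bool = False    # True if the column is part of the composite primary key
    not_null: bool = False


def format_value(value):
    """Render a table option value: booleans and numbers bare, strings quoted."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    return "'" + str(value).replace("'", "''") + "'"


def build_create_table(table, columns, properties=None):
    """Build a CREATE TABLE statement with a composite PK and pass-through options."""
    pk_cols = [c.name for c in columns if c.in_pk]
    if not pk_cols:
        raise ValueError("at least one primary key column is required")

    col_defs = []
    for c in columns:
        # This sketch only models NOT NULL on PK columns, as listed above.
        if c.not_null and not c.in_pk:
            raise ValueError(f"NOT NULL is only modelled for PK columns: {c.name}")
        suffix = " NOT NULL" if c.not_null else ""
        col_defs.append(f"{c.name} {c.sql_type}{suffix}")

    # PK column order follows the order of the column list.
    col_defs.append(f"CONSTRAINT PK PRIMARY KEY ({', '.join(pk_cols)})")
    ddl = f"CREATE TABLE IF NOT EXISTS {table} ({', '.join(col_defs)})"

    # Key/value properties are passed through unchanged as table options.
    if properties:
        opts = ", ".join(f"{k}={format_value(v)}" for k, v in properties.items())
        ddl += " " + opts
    return ddl


example_columns = [
    Column("HOST", "VARCHAR", in_pk=True, not_null=True),
    Column("TS", "DATE", in_pk=True, not_null=True),
    Column("VAL", "DECIMAL(10,2)"),
]
print(build_create_table("METRICS", example_columns,
                         {"SALT_BUCKETS": 4, "VERSIONS": 1}))
```

A metadata bridge would need to carry at least this much information per 
table; whether HCatalog can express all of it is the open question raised 
above.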

We have two import mechanisms already which could likely be leveraged here:
- Pig StoreFunc that enables importing data into HBase in a Phoenix-compliant 
manner
- A bulk MapReduce-based CSV loader that produces Phoenix-compliant HFiles

I think it would be good to have something in the short term that makes 
importing ORC data easy. Would someone in the Hive community be willing to 
take this on? I can help guide them, as could others.



> Support importing directly from ORC formatted HDFS data
> -------------------------------------------------------
>
>                 Key: PHOENIX-918
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-918
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>
> We currently have a good way to import from CSV, but we should also add the 
> ability to import from HDFS ORC files, as this would likely be common if 
> folks have Hive data they'd like to import.
> [~enis], [~ndimiduk], [~devaraj] - Does this make sense, or is there a 
> better, existing way? Any takers on implementing it?



--
This message was sent by Atlassian JIRA
(v6.2#6252)
