[ https://issues.apache.org/jira/browse/PHOENIX-918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13964541#comment-13964541 ]
James Taylor edited comment on PHOENIX-918 at 4/9/14 6:53 PM:
--------------------------------------------------------------
I believe there's an impedance mismatch between HCatalog and Phoenix metadata,
but perhaps this has evolved since I last checked? Phoenix needs to be able to
do the following:
- Define a composite primary key (this is key to our performance and to how we
support secondary indexes)
- Define NOT NULL constraints on columns in the PK
- Define the full range of SQL types that we support
- Pass through key/value properties for various Phoenix-specific and HBase
HTableDescriptor/HColumnDescriptor options
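
As a rough illustration of the four requirements above, here is a minimal sketch of a helper that renders a Phoenix CREATE TABLE statement from a simple schema description. The `Column` class, the `phoenix_ddl` function, the WEB_STAT table, and the chosen property values are all hypothetical and are not part of Phoenix or HCatalog; the sketch only shows the kind of information a metadata source would have to carry (PK membership and order, SQL types, and pass-through table properties).

```python
from dataclasses import dataclass


@dataclass
class Column:
    name: str
    sql_type: str        # a Phoenix SQL type, e.g. VARCHAR, DATE, BIGINT
    in_pk: bool = False  # PK columns are rendered NOT NULL, in declaration order


def phoenix_ddl(table, columns, properties=None):
    """Render a Phoenix CREATE TABLE statement (hypothetical helper)."""
    pk_cols = [c.name for c in columns if c.in_pk]
    if not pk_cols:
        raise ValueError("a Phoenix table needs at least one primary key column")
    lines = []
    for c in columns:
        not_null = " NOT NULL" if c.in_pk else ""
        lines.append(f"    {c.name} {c.sql_type}{not_null}")
    # Composite primary key: column order here defines the row key order.
    lines.append(f"    CONSTRAINT pk PRIMARY KEY ({', '.join(pk_cols)})")
    ddl = f"CREATE TABLE IF NOT EXISTS {table} (\n" + ",\n".join(lines) + "\n)"
    if properties:
        # Phoenix-specific and HBase descriptor options are passed through as key=value.
        ddl += " " + ", ".join(f"{k}={v}" for k, v in properties.items())
    return ddl


ddl = phoenix_ddl(
    "WEB_STAT",
    [
        Column("HOST", "VARCHAR", in_pk=True),
        Column("EVENT_DATE", "DATE", in_pk=True),
        Column("BYTES_SENT", "BIGINT"),
    ],
    {"SALT_BUCKETS": 4, "COMPRESSION": "'GZ'"},
)
print(ddl)
```

A metadata source that cannot express the primary key order or the trailing properties would not be able to feed a generator like this one, which is the mismatch in question.
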
We have two import mechanisms already which could likely be leveraged here:
- A Pig StoreFunc that enables importing data into HBase in a Phoenix-compliant
manner
- A bulk MapReduce-based CSV loader that produces Phoenix-compliant HFiles
I think it would be good to have something in the short term that makes this
kind of import easy. Would there be someone in the Hive community who might be
willing to take this on? I can help guide them, as could others.
> Support importing directly from ORC formatted HDFS data
> -------------------------------------------------------
>
> Key: PHOENIX-918
> URL: https://issues.apache.org/jira/browse/PHOENIX-918
> Project: Phoenix
> Issue Type: Bug
> Reporter: James Taylor
>
> We currently have a good way to import from CSV, but we should also add the
> ability to import from HDFS ORC files, as this would likely be common if
> folks have Hive data they'd like to import.
> [~enis], [~ndimiduk], [~devaraj] - Does this make sense, or is there a
> better, existing way? Any takers on implementing it?