[ https://issues.apache.org/jira/browse/HAWQ-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Radar Lei reassigned HAWQ-1270:
-------------------------------

    Assignee: Yi Jin  (was: Radar Lei)

> Plugged storage back-ends for HAWQ
> ----------------------------------
>
>                 Key: HAWQ-1270
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1270
>             Project: Apache HAWQ
>          Issue Type: Improvement
>            Reporter: Dmitry Buzolin
>            Assignee: Yi Jin
>
> Since HAWQ only depends on Hadoop and Parquet for columnar format support, I
> would like to propose a pluggable storage backend design for HAWQ. Hadoop is
> already supported, but there is also Ceph, a distributed storage system that
> offers a standard POSIX-compliant file system as well as object and block
> storage. Ceph is also data-location aware, is written in C++, and is a more
> sophisticated storage backend than Hadoop at this time. It provides
> replicated and erasure-coded storage pools. Other great features of Ceph are
> snapshots and an algorithmic approach to mapping data to nodes rather than
> relying on centrally managed namenodes. I don't think HDFS offers any of
> these features. In terms of performance, Ceph should be faster than HDFS
> since it is written in C++ and because it doesn't have scalability
> limitations when mapping data to storage pools, unlike Hadoop, where the
> name node is such a point of contention.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)