[jira] [Commented] (HAWQ-1270) Plugged storage back-ends for HAWQ

Kyle R Dunn (JIRA) Wed, 08 Mar 2017 09:40:03 -0800

    [ 
https://issues.apache.org/jira/browse/HAWQ-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15901627#comment-15901627
 ]


Kyle R Dunn commented on HAWQ-1270:
-----------------------------------

From what I can tell, [this|
https://github.com/apache/incubator-hawq/blob/master/src/bin/gpfilesystem/hdfs/gpfshdfs.c]
IS the interface.

The {{pg_filesystem}} catalog table lists the exact functions 
required for a new backend:
{code}
SELECT * from pg_filesystem ;
-[ RECORD 1 ]------+--------------------------
fsysname           | hdfs
fsysconnfn         | gpfs_hdfs_connect
fsysdisconnfn      | gpfs_hdfs_disconnect
fsysopenfn         | gpfs_hdfs_openfile
fsysclosefn        | gpfs_hdfs_closefile
fsysseekfn         | gpfs_hdfs_seek
fsystellfn         | gpfs_hdfs_tell
fsysreadfn         | gpfs_hdfs_read
fsyswritefn        | gpfs_hdfs_write
fsysflushfn        | gpfs_hdfs_sync
fsysdeletefn       | gpfs_hdfs_delete
fsyschmodfn        | gpfs_hdfs_chmod
fsysmkdirfn        | gpfs_hdfs_createdirectory
fsystruncatefn     | gpfs_hdfs_truncate
fsysgetpathinfofn  | gpfs_hdfs_getpathinfo
fsysfreefileinfofn | gpfs_hdfs_freefileinfo
fsyslibfile        | $libdir/gpfshdfs.so
fsysowner          | 10
fsystrusted        | f
fsysacl            |
{code}
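
To make the shape of that contract concrete, here is a minimal sketch of an operation table with one slot per function column, covering only a subset of the columns above. Everything in it is an assumption made for illustration: the {{fs_backend}} struct, the {{local_*}} helpers, and the simplified signatures are hypothetical and are not taken from gpfshdfs.c, and plain C stdio stands in for a real storage system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified operation table: one slot per pg_filesystem
 * function column (subset only). These signatures are illustrative and
 * are NOT the ones used by gpfshdfs.c. */
typedef struct fs_backend {
    const char *name;                                         /* fsysname     */
    void *(*open_file)(const char *path, const char *mode);   /* fsysopenfn   */
    int   (*close_file)(void *handle);                        /* fsysclosefn  */
    int   (*seek)(void *handle, long offset);                 /* fsysseekfn   */
    long  (*tell)(void *handle);                              /* fsystellfn   */
    long  (*read)(void *handle, void *buf, long len);         /* fsysreadfn   */
    long  (*write)(void *handle, const void *buf, long len);  /* fsyswritefn  */
    int   (*flush)(void *handle);                             /* fsysflushfn  */
    int   (*delete_path)(const char *path);                   /* fsysdeletefn */
} fs_backend;

/* Stand-in backend built on C stdio; a real backend would call its own
 * client library here instead. */
static void *local_open(const char *path, const char *mode) { return fopen(path, mode); }
static int   local_close(void *h) { return fclose((FILE *)h); }
static int   local_seek(void *h, long off) { return fseek((FILE *)h, off, SEEK_SET); }
static long  local_tell(void *h) { return ftell((FILE *)h); }
static long  local_read(void *h, void *buf, long len)
{
    return (long)fread(buf, 1, (size_t)len, (FILE *)h);
}
static long  local_write(void *h, const void *buf, long len)
{
    return (long)fwrite(buf, 1, (size_t)len, (FILE *)h);
}
static int   local_flush(void *h) { return fflush((FILE *)h); }
static int   local_delete(const char *path) { return remove(path); }

const fs_backend local_backend = {
    "local", local_open, local_close, local_seek, local_tell,
    local_read, local_write, local_flush, local_delete
};

/* Write a short message through the table, read it back, then delete the
 * file. Returns 0 if every step succeeded, -1 otherwise. */
int fs_roundtrip(const fs_backend *fs, const char *path)
{
    const char msg[] = "hello";
    char buf[sizeof msg] = {0};
    void *h;
    int ok;

    assert(fs != NULL && path != NULL);

    h = fs->open_file(path, "w+b");
    if (h == NULL)
        return -1;

    ok = fs->write(h, msg, (long)sizeof msg) == (long)sizeof msg
      && fs->flush(h) == 0
      && fs->seek(h, 0) == 0
      && fs->read(h, buf, (long)sizeof buf) == (long)sizeof msg
      && fs->tell(h) == (long)sizeof msg
      && memcmp(buf, msg, sizeof msg) == 0;

    if (fs->close_file(h) != 0)
        ok = 0;
    if (fs->delete_path(path) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling {{fs_roundtrip(&local_backend, "scratch.tmp")}} from a test program exercises every slot once. The point of the sketch is only that callers go through the table, so a second backend would supply a different set of functions behind the same slots, much as a new {{pg_filesystem}} row would name a different set of functions and a different {{fsyslibfile}}.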

> Plugged storage back-ends for HAWQ
> ----------------------------------
>
>                 Key: HAWQ-1270
>                 URL: https://issues.apache.org/jira/browse/HAWQ-1270
>             Project: Apache HAWQ
>          Issue Type: Improvement
>            Reporter: Dmitry Buzolin
>            Assignee: Ed Espino
>
> Since HAWQ only depends on Hadoop and Parquet for columnar format support, I 
> would like to propose a pluggable storage backend design for HAWQ. Hadoop is 
> already supported, but there is also Ceph, a distributed storage system that 
> offers a standard POSIX-compliant file system as well as object and block 
> storage. Ceph is also data-location aware, is written in C++, and is a more 
> sophisticated storage backend than Hadoop at this time. It provides 
> replicated and erasure-coded storage pools. Other great features of Ceph 
> are snapshots and an algorithmic approach to mapping data to nodes rather 
> than centrally managed namenodes. I don't think HDFS offers any of these 
> features. In terms of performance, Ceph should be faster than HDFS since it 
> is written in C++ and because it doesn't have scalability limitations when 
> mapping data to storage pools, compared to Hadoop, where the namenode is a 
> point of contention.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
