[ https://issues.apache.org/jira/browse/HDFS-4712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643010#comment-13643010 ]

Colin Patrick McCabe commented on HDFS-4712:
--------------------------------------------

Hi Andrea,

Thanks for posting the code.  It would be better if you posted it in "diff" 
format.

See the section here on "generating a patch" for one possible way to do it.  
http://wiki.apache.org/hadoop/HowToContribute
(If you're familiar with the diff tool, you can also use that tool to generate 
patches.)
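
For example, from a Subversion checkout of the Hadoop source tree, something along these lines will produce a patch file (the file name here is just an example):
{code}
svn diff > HDFS-4712.patch
{code}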

I see that you used 8-space indents in a few places.  Try to match the 
surrounding code instead, so the indentation stays consistent.

Let's create an enum called {{dataNodeReportType}} (or something similar) with 
ALL, DEAD, and LIVE as members.  That way we don't have to mess around with 
strings, and we can always add more enum members later if we need to without 
breaking the existing ABI.

Regarding this declaration from your patch:
{code}
    typedef struct  {
        long capacity;        /* The raw capacity */
        long dfsUsed;         /* The used space by the data node */
        ...
    } hdfsDataNodeInfo;
{code}

Because this struct is anonymous (it has no tag), it's impossible to forward-declare.  I recommend this instead:
{code}
    typedef struct hdfsDataNodeInfo_s {
        long capacity;        /* The raw capacity */
        long dfsUsed;         /* The used space by the data node */
        ...
    } hdfsDataNodeInfo;
{code}
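
With the tag, code that only needs pointers to the struct can forward-declare it without including the full definition.  A quick sketch (the header and function names here are just for illustration):
{code}
    /* consumer.h -- only passes pointers around, so a forward
     * declaration of the tagged struct is all it needs. */
    struct hdfsDataNodeInfo_s;

    void printDataNodeInfo(const struct hdfsDataNodeInfo_s *info);
{code}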

{code}
    hdfsDataNodeInfo *hdfsGetDataNodeInfo(hdfsFS fs, const char *dataNodeType,
                                          int *numEntries);
    void hdfsFreeDataNodeInfo(hdfsDataNodeInfo *hdfsDataNodeInfos, int numEntries);
{code}

I think the proposed API has some problems.  Because it returns a flat array, 
the caller has to index into it using {{sizeof(struct hdfsDataNodeInfo)}}, i.e. 
it assumes the client knows the exact size of the struct.  This means we could 
never add fields to the end in the future (for example, if the Java API gains 
some fields) without breaking existing code.

It would be better to have something like this:
{code}
    enum dataNodeReportType {
        DN_REPORT_ALL = 0,
        DN_REPORT_DEAD,
        DN_REPORT_LIVE
    };
    hdfsDataNodeInfo **hdfsGetDataNodeInfo(hdfsFS fs, enum dataNodeReportType dataNodeType,
                                           int *numEntries);
    void hdfsFreeDataNodeInfo(hdfsDataNodeInfo **hdfsDataNodeInfos);
{code}
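
To illustrate, a caller could then do something like this (error handling omitted, and assuming the {{capacity}} and {{dfsUsed}} fields from your struct above):
{code}
    int numEntries = 0;
    hdfsDataNodeInfo **infos = hdfsGetDataNodeInfo(fs, DN_REPORT_LIVE, &numEntries);
    if (infos) {
        for (int i = 0; i < numEntries; i++) {
            printf("capacity=%ld, dfsUsed=%ld\n",
                   infos[i]->capacity, infos[i]->dfsUsed);
        }
        hdfsFreeDataNodeInfo(infos);
    }
{code}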

In this case, the library would allocate an array of {{hdfsDataNodeInfo*}} 
pointers, and then do a separate allocation for each {{hdfsDataNodeInfo}} 
structure.  That way, if we later need to add a foo and a bar field to the end 
of {{struct hdfsDataNodeInfo}}, we can do that without breaking existing code 
that uses the library.

It is customary to put a NULL pointer at the end of the array, so that 
{{hdfsFreeDataNodeInfo}} can just check for that rather than having to be told 
how long the array is.
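
Roughly, I'm picturing the free function walking the array until it hits that NULL terminator, something like:
{code}
    void hdfsFreeDataNodeInfo(hdfsDataNodeInfo **hdfsDataNodeInfos)
    {
        hdfsDataNodeInfo **i;

        if (!hdfsDataNodeInfos)
            return;
        for (i = hdfsDataNodeInfos; *i; i++) {
            free(*i);              /* each entry was allocated separately */
        }
        free(hdfsDataNodeInfos);   /* then free the array of pointers itself */
    }
{code}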
                
> New libhdfs method hdfsGetDataNodes
> -----------------------------------
>
>                 Key: HDFS-4712
>                 URL: https://issues.apache.org/jira/browse/HDFS-4712
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: libhdfs
>            Reporter: andrea manzi
>         Attachments: hdfs.c.diff, hdfs.h.diff
>
>
> we have implemented a possible extension to libhdfs to retrieve information 
> about the available datanodes (there was a mail on the hadoop-hdfs-dev 
> mailing list initially about this: 
> http://mail-archives.apache.org/mod_mbox/hadoop-hdfs-dev/201204.mbox/%3CCANhO-s0mvororrxpjnjbql6brkj4c7l+u816xkdc+2r0whj...@mail.gmail.com%3E)
> I would like to know how to proceed to create a patch, because on the wiki 
> http://wiki.apache.org/hadoop/HowToContribute I can see info about Java 
> patches but nothing related to extensions in C.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
