[ 
https://issues.apache.org/jira/browse/PIG-3901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip (flip) Kromer updated PIG-3901:
--------------------------------------

    Attachment: 0002-Fully-commented-the-pig.properties-file.patch
                0001-Organized-pig-properties-file-to-group-properties-in.patch

The first patch only rearranges the existing properties file into sensible 
groups. If you run `sort conf/pig.properties > /tmp/p1 ; sort 
conf/pig-old.properties > /tmp/p2 ; diff -uw /tmp/{p1,p2}`, you will see no 
differences. The second patch cleans up the formatting and adds complete 
comments for each property, explaining what the feature does, what its default 
and other allowed values are, why a user might change it from the default, and 
what might go wrong if they do.
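
For illustration, a comment block in this format might look roughly like the 
sketch below. The property name `pig.hemiconducer.enabled` is hypothetical: it 
borrows the made-up feature name from the issue description and is not a real 
Pig setting. The bracketed text is placeholder wording, not documentation of 
any actual feature.

```properties
# Hypothetical example only: 'pig.hemiconducer.enabled' is NOT a real Pig
# property. It shows the intended comment layout, nothing more.
#
# Enables the hemiconducer feature, which [impact on job execution or
# performance, stated in one or two sentences].
#
# * false (default): [behavior with the feature off]
# * true:            [behavior with the feature on, and why you might want it]
#
# EXPERIMENTAL: [only if the feature is experimental]
# WARNING: [only if changing the value is dangerous, and what can go wrong]
#
# pig.hemiconducer.enabled=false
```

The setting itself is left commented out at its default value, so an 
unmodified file does not change behavior.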

> Organize the Pig properties file and document all properties
> ------------------------------------------------------------
>
>                 Key: PIG-3901
>                 URL: https://issues.apache.org/jira/browse/PIG-3901
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Philip (flip) Kromer
>            Assignee: Philip (flip) Kromer
>            Priority: Minor
>              Labels: conf, config, documentation, properties, settings
>         Attachments: 
> 0001-Organized-pig-properties-file-to-group-properties-in.patch, 
> 0002-Fully-commented-the-pig.properties-file.patch, 
> organize_pig_properties.patch
>
>
> The current pig.properties file could use some love. Each property should be 
> introduced by a documentation string explaining
> * what the feature does,
> * what its default and other allowed values are,
> * why a user might change it from the default,
> * and what might go wrong with each.
> The documentation should follow a common format -- I propose the following 
> guidelines:
> * Each property should either supply a bulleted list of acceptable values, 
> indicating the default, or provide the default value inline with the 
> description.
> * Don't say 'This setting lets you control whether Pig will decide to use the 
> hemiconducer feature'; say 'Enables the hemiconducer feature, which [...]'.
> * Don't document the internals of the feature. Describe its impact on job 
> execution or performance.
> * Use consistent indentation, title formatting, and block delimiting. (The 
> current patch does not yet do so completely, as I'm figuring it out)
> * Place each setting in the appropriate block according to its impact on the 
> user experience.
> * Call out Experimental features with `EXPERIMENTAL`, but group them with 
> similar settings.
> * If a setting is dangerous, call that out with `WARNING`.
> * If one value is always appropriate for casual use, or always appropriate 
> for production use, call that out. Production use should assume a 
> moderately loaded single-rack Hadoop cluster set up according to a major 
> distro's reference configuration -- people running massive-scale 
> installations don't need this file's advice.
> I've attached a patch that organizes the current properties file and 
> documents everything I felt confident describing. This is a preliminary 
> patch, as I'll need some help documenting many of the currently undocumented 
> properties. Please review what I've written carefully; I have reasonable 
> experience programming Pig but limited familiarity with the experimental 
> features.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
