The package I am writing is for the Center for Tropical Forest Science, CTFS.  This 
"center" is a collaboration of 15+ institutions worldwide that are investigating 
tropical forest dynamics, species diversity and species distributions.  Every site 
uses the same sampling design: a large plot (usually 50 hectares) in which every tree 
>= 10 mm in diameter has been tagged, mapped and identified.  Re-enumerations occur 
every 5 years (mostly).  Other information, such as topography, canopy structure, 
seedling traps, etc., is collected to different degrees at different sites.  The sites 
vary a lot: some have more than one plot, some have two 25 ha plots, some have 52 
hectares, some have 1200 species, and species identification can easily take 10 years 
and counting.  It is a very large project with 18 pantropical sites, over 3 million 
trees and 5000+ species, with data ranging from the first census to the 7th census.

The people doing the field work and the analysis vary tremendously in their 
backgrounds and expertise.  There are many collaborators who have done little or none 
of the field work; some haven't even been to the sites.  The types of analyses that 
are done are wide ranging and clearly cannot be crammed into a "standard" stats 
package.  The powers that be at CTFS decided to adopt R for analysis and away we went.

Four years have gone by and we have an odd collection of functions, few of them 
documented even inside the code (programmers hate to document), and an odd collection 
of manuals (mostly written by me) of varying complexity and integrity.  It became 
obvious to me this year that we needed to use more of R's capabilities and quit 
reinventing the packaging of functions and the manuals for running them.  So I have 
taken on the responsibility of being the clearing house for all CTFS functions, 
checking them for usefulness, function and generality (which is not always necessary), 
and writing up help files.  And now, thanks to you guys, I have managed to start the 
process of packaging it all together so we can all work from the same function 
resource base!

Now, how do we fit our functions into \keyword{} and use \concept{}?

1.  Many functions are specific to the CTFS data structures: 

mort.spp.habitat(), which computes the mortality for a "population" of individuals 
that belong to a single species and occupy a "habitat" defined in previous analyses 
and mapped to locations in the plot.  

2.  Some functions that do most of the "work" are more generic:
mort.calc(), which calculates the annual mortality rate of a population and provides 
confidence limits for the rate through other functions.

3.  Some are very generic and just make it simpler to interact with other CTFS 
programs.  Those programs could probably be made more generic themselves, but that 
hasn't happened yet, or they might then take too long to run (we do a lot of 
randomization and generation of distributions for assigning probabilities for 
statistical results):
split.data(), which takes a dataframe of census information on trees and makes a new 
dataset that is a list of dataframes, one dataframe for each species.  It just 
restructures the data for ease of use in other programs and for more rapid access to, 
in this case, species-based information.
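
To make the restructuring in #3 concrete, here is a minimal sketch of the idea using 
base R's split().  It is not the actual split.data() code; the census dataframe, its 
column names (tag, sp, dbh) and the species codes are all made up for illustration.

```r
# Hypothetical census data: one row per tree.
# Column names and species codes are assumptions, not the real CTFS layout.
census <- data.frame(
  tag = c(1, 2, 3, 4, 5),
  sp  = c("spA", "spB", "spA", "spC", "spB"),
  dbh = c(12, 105, 33, 48, 19),
  stringsAsFactors = FALSE
)

# Base R split() returns a list of dataframes, one per distinct value of sp.
by.species <- split(census, census$sp)

# Each species is then available by name without rescanning the full table.
names(by.species)
by.species[["spA"]]
```

Looking up a species by name in the list avoids subsetting the whole census table 
every time, which is presumably where the speed-up in repeated randomizations comes from.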

I believe I understand the \concept{} entries of Rd files... 

Knowing the audience for whom I am writing this package, I have provided a number of 
concept entries for each function so that help.search() will dig up related and 
useful functions.  How the CTFS functions relate to each other is a very 
audience-dependent phenomenon.  I'll try out my ideas on the CTFS users and let them 
tell me whether they took a long time to find what they needed or not.
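
For reference, this is roughly how I expect users to search those entries.  It is only 
a sketch: "mortality" stands in for whatever concept text ends up in the Rd files, and 
the keyword search is limited to the base package just to keep the example quick.

```r
# Search only the \concept{} entries of the installed help pages.
# "mortality" is a placeholder for a concept string in the Rd files.
concept.hits <- help.search("mortality", fields = "concept")

# Search by \keyword{} entry instead, using a keyword listed in KEYWORDS.
# package = "base" only restricts the search so that it runs quickly.
keyword.hits <- help.search(keyword = "manip", package = "base")

# Both calls return an object of class "hsearch"; printing it lists the matches.
class(concept.hits)
```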

I now understand that \keyword{} entries are not keywords in the usual sense, and I 
have looked at the KEYWORDS file.

For function #3 above, I would say this is a type of data manipulation and is used 
within CTFS as a file utility, so is the keyword:

\keyword{Basics:manip, utilities}

For function #2 above, the mortality rate is just a piece of arithmetic and the CI 
counts as stats, but this isn't a survival analysis; it's just a defined computation 
suitable for the uses of the CTFS crowd.  So what keyword is appropriate?

Function #1 is a form of data manipulation too, but so are all of our R programs.

Now, I'm confused.  I agree that there is no reason to create a new keyword since the 
CTFS stuff is so specific, but should I just call nearly everything we write "misc"?

-ph
