Re: [R-pkg-devel] duplicate function during build

2016-07-27 Thread Enrico Schumann
On Sat, 23 Jul 2016, ProfJCNash writes:

> Thanks Sven. That indeed works. And if anyone has ideas how it could be
> put into R so Windows users could benefit, I'm sure it would be useful
> in checks of packages.

You could use R functionality to rewrite the shell
commands. Perhaps along these lines:

--8<---cut here---start->8---
fun_names <- function(dir,
                      duplicates_only = TRUE,
                      file_pattern = "[.][rR]$",
                      fun_pattern = " *([^\\s]+) *<- *function.*") {

    files <- dir(dir, pattern = file_pattern, full.names = TRUE)
    ans <- data.frame(fun = character(0),
                      file = character(0),
                      line = integer(0),
                      stringsAsFactors = FALSE)
    for (f in files) {
        txt <- readLines(f)
        ## perl = TRUE, as in gsub() below, so that \\s means whitespace
        fun.lines <- grepl(fun_pattern, txt, perl = TRUE)

        if (any(fun.lines)) {
            ans <- rbind(ans,
                         data.frame(fun = gsub(fun_pattern, "\\1",
                                               txt[fun.lines],
                                               perl = TRUE),
                                    file = f,
                                    line = which(fun.lines),
                                    stringsAsFactors = FALSE))
        }
    }

    ans <- ans[order(ans[["fun"]]), ]

    if (duplicates_only) {
        ## keep the first occurrence of each duplicated name, too
        d <- duplicated(ans[["fun"]])
        d0 <- match(unique(ans[["fun"]][d]), ans[["fun"]])
        ans <- ans[sort(c(d0, which(d))), ]
    }

    ans
}
--8<---cut here---end--->8---

One would then call the function on a directory.

For instance,

  fun_names("~/Packages/NMOF/R")

gives me output

        fun                                      file line
10 cfHeston       /home/es/Packages/NMOF/R/callCF.R   41
18 cfHeston /home/es/Packages/NMOF/R/callHestoncf.R   29

## [...] 

But it will be tricky to catch only such re-definitions
of functions that have been left in the files by
mistake. For instance, I often define short helper
functions within other functions, and such helper
functions might then get flagged, too.

Kind regards
Enrico
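
One way around the helper-function problem mentioned above might be to
parse the files instead of matching lines, and to look only at
assignments made at the top level of each file. The following is a
minimal sketch under that assumption; the name top_level_funs and its
arguments are made up for illustration and are not part of the code
posted in this thread. It uses only parse() from base R and does not
evaluate any package code.

```r
## Sketch: list functions assigned at the top level of each file, so that
## helpers defined inside other functions are not reported.
## (top_level_funs is a hypothetical name, chosen for this example.)
top_level_funs <- function(dir,
                           file_pattern = "[.][rR]$",
                           duplicates_only = TRUE) {
    files <- list.files(dir, pattern = file_pattern, full.names = TRUE)
    ans <- data.frame(fun = character(0), file = character(0),
                      stringsAsFactors = FALSE)
    for (f in files) {
        for (e in as.list(parse(f, keep.source = FALSE))) {
            ## a top-level call of the form  name <- function(...)
            ## or  name = function(...)
            is_assignment <- is.call(e) && length(e) == 3L &&
                (identical(e[[1L]], as.name("<-")) ||
                 identical(e[[1L]], as.name("=")))
            if (is_assignment && is.name(e[[2L]]) && is.call(e[[3L]]) &&
                identical(e[[3L]][[1L]], as.name("function"))) {
                ans <- rbind(ans,
                             data.frame(fun = as.character(e[[2L]]),
                                        file = f,
                                        stringsAsFactors = FALSE))
            }
        }
    }
    if (duplicates_only) {
        dup <- unique(ans$fun[duplicated(ans$fun)])
        ans <- ans[ans$fun %in% dup, ]
    }
    ans[order(ans$fun, ans$file), ]
}
```

Called as top_level_funs("R") from the package source root, this should
report only names that are assigned a function at the top level of more
than one place. Functions created in other ways (for example through
assign() or a function factory) would be missed.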



Re: [R-pkg-devel] duplicate function during build

2016-07-23 Thread Roy Mendelssohn - NOAA Federal
I don't know if ctags works with R files, but for other languages ctags does
something similar to what you are asking for, and it can be integrated into
git using hooks, as in:

https://robots.thoughtbot.com/use-git-hooks-to-automate-annoying-tasks

Don't know if this helps,  but thought I would pass it along.

-Roy

**
"The contents of this message do not reflect any position of the U.S. 
Government or NOAA."
**
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
***Note new address and phone***
110 Shaffer Road
Santa Cruz, CA 95060
Phone: (831)-420-3666
Fax: (831) 420-3980
e-mail: roy.mendelss...@noaa.gov www: http://www.pfeg.noaa.gov/

"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected" 
"the arc of the moral universe is long, but it bends toward justice" -MLK Jr.

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] duplicate function during build

2016-07-23 Thread ProfJCNash
Thanks Sven. That indeed works. And if anyone has ideas how it could be
put into R so Windows users could benefit, I'm sure it would be useful
in checks of packages.

In other investigations of this, I realized that install.R has to
prepare the .rdb and .rdx files and at that stage duplication might be
detected. If install.R puts both versions of a duplicated name into
these files, then the lazy load of library() or require() could be a
place where detection would be useful, though only one of the names gets
actually made available for use. However, my expertise with this
internal aspect of R is rather weak.

Cheers, JN
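
The idea of detecting the clash at the point where the files are
evaluated can be tried outside of R's own install code. What follows is
only a sketch under several assumptions: it evaluates each file with
sys.source(), so the code in the files must be runnable on its own; it
assumes the files are processed in C-locale order (a Collate field in
DESCRIPTION would change that); and the name check_redefinitions is
invented for this example. It does not reproduce what install.R
actually does.

```r
## Sketch: evaluate files one by one and report names that an earlier
## file has already defined. (check_redefinitions is a hypothetical name.)
check_redefinitions <- function(files) {
    pkg_env <- new.env()      # stands in for the package namespace
    defined_in <- list()      # object name -> file that defined it last
    clashes <- character(0)
    ## method = "radix" sorts as in the C locale; that this matches the
    ## order used at install time is an assumption of this sketch
    for (f in sort(files, method = "radix")) {
        file_env <- new.env(parent = pkg_env)
        sys.source(f, envir = file_env)   # evaluates the code in the file
        for (nm in ls(file_env, all.names = TRUE)) {
            if (!is.null(defined_in[[nm]])) {
                clashes <- c(clashes,
                             sprintf("%s: %s replaces the definition from %s",
                                     nm, basename(f),
                                     basename(defined_in[[nm]])))
            }
            defined_in[[nm]] <- f
            ## the later definition wins, as it would in the built package
            assign(nm, get(nm, envir = file_env), envir = pkg_env)
        }
    }
    clashes
}
```

For example, check_redefinitions(list.files("R", full.names = TRUE))
returns one message per name that is defined again in a later file, and
an empty character vector if there are no clashes.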



Re: [R-pkg-devel] duplicate function during build

2016-07-23 Thread Paul Gilbert

Hadley

My initial reflex reaction was svn/git too, but then I could not see how 
to use either to identify the problem John had. If you have a good 
svn/git command for identifying duplicate functions could you please 
post it, I am curious. (BTW, John does use svn, and possibly git too.)


Thanks,
Paul



Re: [R-pkg-devel] duplicate function during build

2016-07-23 Thread Sven E. Templer
Although it might help, learning/using git does not tackle this specific
problem. I suggest code such as:

sed -e 's/^[\ \t]*//' -e 's/#.*//' R/* | awk '/function/{print $1}' | sort | uniq -d

or

https://gist.github.com/setempler/7fcf2a3a737ce1293e0623d2bb8e08ed
(any comments welcome)

If one knows how to code in R, it might be more productive to develop a tiny
tool for that, instead of learning a new (and complex) one (such as git).

Nevertheless, git is great!

Best wishes,

Sven

---

web: www.templer.se
twitter: @setempler


Re: [R-pkg-devel] duplicate function during build

2016-07-23 Thread Hadley Wickham
I think this sort of meta problem is best solved with svn/git because you
can easily see if the changes you think you made align with the changes you
actually made. Learning svn or git is a lot of work, but the payoff is
worth it.

Hadley



-- 
http://hadley.nz



Re: [R-pkg-devel] duplicate function during build

2016-07-22 Thread Sven E. Templer
Not during build, but before, you could run in bash from the package source
root:

$ awk '/function/{print $1}' R/* | sort | uniq -d

To find the files, use:

$ grep  R/*

Best wishes,

Sven
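
For readers without grep (for instance on Windows), the file lookup
could also be done from within R. This is a rough sketch, not something
posted in the thread: find_name is a hypothetical helper, and it does a
plain fixed-string search, so it also reports lines that merely mention
the name.

```r
## Sketch: report the files and lines in which a given name occurs.
## (find_name is a hypothetical helper, named for this example only.)
find_name <- function(name, dir = "R", file_pattern = "[.][rR]$") {
    files <- list.files(dir, pattern = file_pattern, full.names = TRUE)
    empty <- data.frame(file = character(0), line = integer(0),
                        text = character(0), stringsAsFactors = FALSE)
    hits <- lapply(files, function(f) {
        txt <- readLines(f, warn = FALSE)
        i <- grep(name, txt, fixed = TRUE)   # literal match, no regex
        data.frame(file = rep(f, length(i)), line = i, text = txt[i],
                   stringsAsFactors = FALSE)
    })
    ## rbind() drops zero-row pieces; 'empty' is returned if nothing matches
    do.call(rbind, c(list(empty), hits))
}
```

A call such as find_name("cfHeston", "R") would then list every file and
line number where that name appears.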

> On 23 Jul 2016, at 05:01, ProfJCNash wrote:
> 
> In trying to rationalize some files in a package I'm working on, I
> copied a function from one file to another, but forgot to change the
> name of one of them. It turns out the name of the file containing the
> "old" function was later in collation sequence than the one I was
> planning to be the "new" one. To debug some issues, I put some print()
> and cat() statements in the "new" file, but after building the package,
> they weren't there. Turns out the "old" function got installed, as might
> be expected if files are processed in order. Debugging this took about 2
> hours of slightly weird effort with 2 machines and 3 OS distributions
> before I realized the problem. It's fairly obvious that I should expect
> issues in this case, but not so clear how to detect the source of the
> problem.
> 
> Question: Has anyone created a script to catch such duplicate functions
> from different files during build? I think a warning message that there
> are duplicate functions could save some time and effort. Maybe it's
> already there, but I saw no obvious message. In this case, I'm only
> working in R.
> 
> I've found build.R in the R tarball, which is where I suspect such a
> check should go, and I'm willing to prepare a patch when I figure out
> how this should be done. However, it seems worth asking if anyone has
> needed to do this before. I've already done some searching, but the
> results seem to pick up quite different posts than I need.
> 
> Cheers, JN