Re: [Rd] Please add me to bugzilla

2017-03-06 Thread Martin Maechler
> Bradley Broom 
> on Mon, 6 Mar 2017 06:55:35 -0600 writes:

> Apologies, I thought I was following exactly that sentence
> and trying to make a minimal post that would waste as
> little developer bandwidth as possible given the lack of a
> better system.

I understand.   My apologies now, as I was mistrusting, clearly
wrongly in this case.

> Anyway, I have been using R for like forever (20 years).

> In my current project, I have run into problems with stack
> overflows in R's dendrogram code when trying to use either
> str() or as.hclust() on very deep dendrograms.

I understand.  Indeed, bug PR#16424 
https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16424
encountered the same problem in other dendrogram functions and
solved it by re-programming the relevant parts non-recursively,
too.
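
For illustration only (this is not the code from PR#16424 or from the
attached patch): the general idea of replacing recursion by an explicit
stack could be sketched as below. The helper name count_leaves_iter and
the use of plain nested lists, where every non-list element counts as a
leaf, are assumptions made for the example; real dendrogram code would
test is.leaf() instead.

```r
## Hypothetical helper: count the leaves of a nested list without recursion.
## Assumption: any non-list element is a leaf.
count_leaves_iter <- function(tree) {
    stack <- list(tree)   # explicit stack of nodes still to visit
    count <- 0L
    while (length(stack) > 0L) {
        top <- length(stack)
        node <- stack[[top]]
        stack[[top]] <- NULL              # pop the node just taken
        if (is.list(node)) {
            ## push every child; they are visited in later iterations
            for (child in node) stack[[length(stack) + 1L]] <- child
        } else {
            count <- count + 1L
        }
    }
    count
}

count_leaves_iter(list(1, list(2, list(3, 4))))   # 4 leaves
```

Because the pending nodes live in an ordinary R list rather than in
nested function calls, the depth of the tree no longer translates into
depth of the call stack.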

   [.]  

> What should happen: Function completes without a stack
> overflow.

> 2nd bug: hh <- as.hclust(de)

> What happens: Error: C stack usage 7971248 is too close to the limit

> What should happen: Function completes without a stack
> overflow.

> A knowledgeable user might be able to increase R's limits
> to avoid these errors on this particular dendrogram, but
> a) my users aren't that knowledgeable about R and this is
> expected to be a common problem, and b) there will be
> bigger dendrograms (up to at least 25000 leaves).

Agreed.  The current help page warns about the problem and
gives advice (related to increasing the stack), but what you propose
is better, i.e., re-implementing relevant parts non-recursively.
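
A rough sketch of what that advice amounts to in practice, assuming an
interactive session; the value 50000 is an arbitrary example, not a
recommendation taken from the help page:

```r
## Limit on nested R expressions for this session.
getOption("expressions")

## C stack size and current usage, as far as the platform reports them.
Cstack_info()

## Raise the expression limit temporarily, then restore the old value.
old <- options(expressions = 50000)
## ... work with the deep dendrogram here ...
options(old)
```

The C stack size itself generally cannot be raised from inside a
running R session; where it can be changed at all, that is usually done
in the shell before R is started.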

> Please see attached patch for non-recursive
> implementations.

Very well done, thank you a lot!
[and I will add you to bugzilla .. so you can use it for the
 next bug .. ;-)]

Best,
Martin

> Regards, Bradley



> On Mon, Mar 6, 2017 at 3:50 AM, Martin Maechler
>  wrote:

>> > Bradley Broom
>> > on Sun, 5 Mar 2017 16:03:30 -0600 writes:
>> 
>> > Please add me to R bugzilla.  Thanks, Bradley
>> 
>> Well, I will not do it just like that (meaning "after such a
>> minimal message").
>> 
>> I don't see any evidence as to your credentials,
>> knowledge of R, etc, as part of this request.  We are all
>> professionals, devoting part of our (work and free) time
>> to the R project (rather than employees of the company
>> you paid to serve you ...)
>> 
>> It may be that you have read
>> https://www.r-project.org/bugs.html
>> 
>> Notably this part
>> 
>> --> NOTE: due to abuse by spammers, since 2016-07-09 only users who have
>> previously submitted bugs can submit new ones on R’s
>> Bugzilla. We’re working on a better system… In the mean
>> time, post (e-mail) to R-devel or ask an R Core member to
>> add you manually to R’s Bugzilla members.
>> 
>> The last sentence was *meant* to say you should post
>> (possibly parts, ideally a minimal reproducible example
>> of) your bug report to R-devel so others could comment on
>> it, agree or disagree with your assessment etc, __or__
>> ask an R-core member to add you to bugzilla (if you
>> really read the other parts of the 'R bugs' web page
>> above).
>> 
>> Posting to all 1000 R-devel readers with no content about
>> what you consider a bug is a waste of bandwidth for at
>> least 99% of these readers.
>> 
>> [Yes, I'm also using their time ... in the hope to
>> *improve* the quality of future such postings].
>> 
>> Martin Maechler ETH Zurich
>> 
> [attachment: dendro-non-recursive.patch, text/x-patch]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Please add me to bugzilla

2017-03-06 Thread Bradley Broom
Apologies, I thought I was following exactly that sentence and trying to
make a minimal post that would waste as little developer bandwidth as
possible given the lack of a better system.

Anyway, I have been using R for like forever (20 years).

In my current project, I have run into problems with stack overflows in R's
dendrogram code when trying to use either str() or as.hclust() on very deep
dendrograms.

To duplicate, use this function from tests/reg-tests-1c.R in the R source
code:
mkDend <- function(n, lab, method = "complete",
                   ## gives *ties* often:
                   rGen = function(n) 1+round(16*abs(rnorm(n)))) {
    stopifnot(is.numeric(n), length(n) == 1, n >= 1, is.character(lab))
    a <- matrix(rGen(n*n), n, n)
    colnames(a) <- rownames(a) <- paste0(lab, 1:n)
    .HC. <<- hclust(as.dist(a + t(a)), method=method)
    as.dendrogram(.HC.)
}

Get a nasty dendrogram:

de <- mkDend(2000, 'x', 'single')

1st bug:
sink('somefile.txt'); str(de); sink();

What happens:
Error in getOption("OutDec") : node stack overflow

Also, the last call to sink() isn't executed because of the error, so
you'll need to call sink() after the error to clear the diversion.
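
One way to avoid the dangling diversion while reproducing this, as a
sketch only: wrap the call so that the diversion is always removed.
The wrapper name with_sink is made up for this example and is not part
of R.

```r
## Hypothetical wrapper: evaluate 'expr' with output diverted to 'file',
## and remove the diversion again even if 'expr' signals an error.
with_sink <- function(file, expr) {
    sink(file)
    on.exit(sink(), add = TRUE)   # runs on normal exit and on error
    force(expr)                   # 'expr' is only evaluated here
    invisible(NULL)
}

## Usage with a small object; for the report above this would be str(de).
out <- tempfile(fileext = ".txt")
with_sink(out, str(list(a = 1, b = "x")))
```

If str() fails with a stack overflow inside with_sink(), the error is
still reported, but no extra call to sink() is needed afterwards.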

What should happen:
Function completes without a stack overflow.

2nd bug:
hh <- as.hclust(de)

What happens:
Error: C stack usage  7971248 is too close to the limit

What should happen:
Function completes without a stack overflow.

A knowledgeable user might be able to increase R's limits to avoid these
errors on this particular dendrogram, but a) my users aren't that
knowledgeable about R and this is expected to be a common problem, and b)
there will be bigger dendrograms (up to at least 25000 leaves).

Please see attached patch for non-recursive implementations.

Regards,
Bradley



On Mon, Mar 6, 2017 at 3:50 AM, Martin Maechler 
wrote:

> > Bradley Broom 
> > on Sun, 5 Mar 2017 16:03:30 -0600 writes:
>
> > Please add me to R bugzilla.  Thanks, Bradley
>
> Well, I will not do it just like that (meaning "after such a
> minimal message").
>
> I don't see any evidence as to your credentials, knowledge of R,
> etc, as part of this request.  We are all professionals,
> devoting part of our (work and free) time to the R project
> (rather than employees of the company you paid to serve you ...)
>
> It may be that you have read   https://www.r-project.org/bugs.html
>
> Notably this part
>
> --> NOTE: due to abuse by spammers, since 2016-07-09 only users who have
> previously submitted bugs can submit new ones on R’s Bugzilla. We’re
> working on a better system… In the mean time, post (e-mail) to R-devel or
> ask an R Core member to add you manually to R’s Bugzilla members.
>
> The last sentence was *meant* to say you should post (possibly
> parts, ideally a minimal reproducible example of) your bug
> report to R-devel so others could comment on it, agree or
> disagree with your assessment etc,
> __or__ ask an R-core member to add you to bugzilla (if you really read the
> other parts of the 'R bugs' web page above).
>
> Posting to all 1000 R-devel readers with no content about what
> you consider a bug  is a waste of bandwidth for at least 99% of
> these readers.
>
> [Yes, I'm also using their time ... in the hope to *improve* the
>  quality of future such postings].
>
> Martin Maechler
> ETH Zurich
>
Index: src/library/stats/R/dendrogram.R
===
--- src/library/stats/R/dendrogram.R	(revision 72314)
+++ src/library/stats/R/dendrogram.R	(working copy)
@@ -81,60 +81,130 @@
 structure(z[[as.character(k)]], class = "dendrogram")
 }
 
+# Count the number of leaves in a dendrogram.
+nleaves <- function (node) {
+    if (is.leaf(node)) { return (1L) }
+    todo <- NULL # Non-leaf nodes to traverse after this one.
+    count <- 0L
+    repeat {
+        # For each child: count iff a leaf, add to todo list otherwise.
+        while (length(node)) {
+            child <- node[[1L]];
+            node <- node[-1L];
+            if (is.leaf(child)) {
+                count <- count + 1L
+            } else {
+                todo <- list(node=child, todo=todo)
+            }
+        }
+        # Advance to next node, terminating when no nodes left to count.
+        if (is.null(todo)) {
+            break
+        } else {
+            node <- todo$node
+            todo <- todo$todo
+        }
+    }
+    return (count)
+}
+
 ## Reversing the above (as much as possible)
 ## is only possible for dendrograms with *binary* splits
 as.hclust.dendrogram <- function(x, ...)
 {
-    stopifnot(is.list(x), length(x) == 2)
-    n <- length(ord <- as.integer(unlist(x)))
+    stopifnot(is.list(x), length(x) == 2L)
+    n <- nleaves(x)
+    stopifnot(n == attr(x, "members"))
+
+    # Ord and labels for each leaf node (in preorder).
+    ord <- integer(n)
+    labsu <- character(n)
+
+    # Height and (parent,index) for each internal node (in preorder).
+    n.h <- n - 1L
+    height <- numeric(n.h)
+myIdx 

Re: [Rd] Please add me to bugzilla

2017-03-06 Thread Martin Maechler
> Bradley Broom 
> on Sun, 5 Mar 2017 16:03:30 -0600 writes:

> Please add me to R bugzilla.  Thanks, Bradley

Well, I will not do it just like that (meaning "after such a
minimal message").

I don't see any evidence as to your credentials, knowledge of R,
etc, as part of this request.  We are all professionals,
devoting part of our (work and free) time to the R project
(rather than employees of the company you paid to serve you ...)

It may be that you have read   https://www.r-project.org/bugs.html

Notably this part

--> NOTE: due to abuse by spammers, since 2016-07-09 only users who have 
previously submitted bugs can submit new ones on R’s Bugzilla. We’re working on 
a better system… In the mean time, post (e-mail) to R-devel or ask an R Core 
member to add you manually to R’s Bugzilla members.

The last sentence was *meant* to say you should post (possibly
parts, ideally a minimal reproducible example of) your bug
report to R-devel so others could comment on it, agree or
disagree with your assessment etc,
__or__ ask an R-core member to add you to bugzilla (if you really read the
other parts of the 'R bugs' web page above).

Posting to all 1000 R-devel readers with no content about what
you consider a bug  is a waste of bandwidth for at least 99% of
these readers.

[Yes, I'm also using their time ... in the hope to *improve* the
 quality of future such postings].

Martin Maechler
ETH Zurich


[Rd] Please add me to bugzilla

2017-03-05 Thread Bradley Broom
Please add me to R bugzilla.

Thanks,
Bradley
