>>>>> Juan Telleria Ruiz de Aguirre 
>>>>>     on Tue, 31 Jul 2018 08:19:33 +0200 writes:

    > I polished a little bit more the function:
    > * Used:  getOption("max.print")
    > * Added comment at the end:  cat('[ reached getOption("max.print") --
    > omitted ', omitted,' rows ]')

and before

     > I would like to propose a simple optimization for print.data.frame
     > base function:
     > 
     > To add: x <- as.data.frame(head(x, n = options("max.print")))
     > 
     > This would prevent that, if for example, we have a 10GB data.frame
     > (e.g.: Instead of a data.table), and we accidentally print it, the R
     > Session does not "collapse", forcing us to press ESC or kill the
     > RSession.

Thank you, Juan.
You are right: The whole idea of introducing the 'max.print'
option (and the corresponding 'max' argument in print.default(),
and currently also in print.Date()) was that print()ing should
not use too many resources.

You are also right to use 'max.print', but R should be as
functional a language as is sensible, and hence print(<data.frame>)
should get an argument 'max' which by default is equal to
the "max.print" option.
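
A minimal sketch of that pattern, using a hypothetical helper show_head()
(not part of R, for illustration only): 'max' is an ordinary argument, and
the "max.print" option is consulted only when the caller did not supply one.

```r
## Hypothetical helper, for illustration only: 'max' is an explicit
## argument whose default is taken from the "max.print" option.
show_head <- function(x, max = NULL) {
    if (is.null(max)) max <- getOption("max.print", 99999L)
    n <- min(length(x), max)
    print(x[seq_len(n)])
    if (n < length(x))
        cat(" [ omitted", length(x) - n, "entries ]\n")
    invisible(x)
}

show_head(1:10, max = 3)   # explicit argument; the option is not consulted
```

Passing the limit as an argument keeps the function's behaviour determined
by its inputs, with the global option serving only as the default.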

Also, any good citizen print() method *must* return its argument invisibly.
==> you are not supposed to change 'x' here.
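
As a small illustration of that convention, here is a sketch with a toy S3
class "temp" (hypothetical, not from the patch): the method formats a copy
for display and returns the original, unmodified 'x' invisibly.

```r
## Toy S3 class 'temp' (hypothetical), used only to show the convention.
print.temp <- function(x, ...) {
    cat(format(unclass(x)), "degrees\n")   # display only; 'x' is not altered
    invisible(x)                           # return the argument, invisibly
}

t1 <- structure(c(20.5, 22), class = "temp")
t2 <- print(t1)        # prints, and t2 receives the unchanged object
identical(t1, t2)
```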

But I entirely agree with your basic intuition for the problem
resolution.  Very good, thank you, indeed!

I'm currently running 'make check-all'  with the following change
to the source code (aka "patch") :

===================================================================
--- src/library/base/R/dataframe.R      (revision 75016)
+++ src/library/base/R/dataframe.R      (working copy)
@@ -1477,7 +1477,7 @@
 
 print.data.frame <-
     function(x, ..., digits = NULL, quote = FALSE, right = TRUE,
-            row.names = TRUE)
+            row.names = TRUE, max = NULL)
 {
     n <- length(row.names(x))
     if(length(x) == 0L) {
@@ -1489,12 +1489,19 @@
        print.default(names(x), quote = FALSE)
        cat(gettext("<0 rows> (or 0-length row.names)\n"))
     } else {
+       if(is.null(max)) max <- getOption("max.print", 99999L)
        ## format.<*>() : avoiding picking up e.g. format.AsIs
-       m <- as.matrix(format.data.frame(x, digits = digits, na.encode = FALSE))
+       omit <- (n0 <- max %/% length(x)) < n
+       m <- as.matrix(
+           format.data.frame(if(omit) x[seq_len(n0), , drop=FALSE] else x,
+                             digits = digits, na.encode = FALSE))
        if(!isTRUE(row.names))
            dimnames(m)[[1L]] <-
                if(isFALSE(row.names)) rep.int("", n) else row.names
        print(m, ..., quote = quote, right = right)
+       if(omit)
+           cat(" [ reached 'max' / getOption(\"max.print\") -- omitted",
+               n - n0, "rows ]\n")
     }
     invisible(x)
 }
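
To make the row budget in the patch concrete: 'max' counts entries (cells),
so the number of rows printed is max %/% length(x), where length(x) is the
number of columns. A sketch with a hypothetical helper rows_shown() (not
part of the patch) that mirrors this arithmetic:

```r
## Hypothetical helper mirroring the patch's arithmetic; not part of R.
rows_shown <- function(x, max = getOption("max.print", 99999L)) {
    n  <- nrow(x)
    n0 <- max %/% length(x)        # whole rows that fit into 'max' entries
    omit <- n0 < n
    c(shown   = if (omit) n0 else n,
      omitted = if (omit) n - n0 else 0)
}

df <- data.frame(a = 1:100, b = sqrt(1:100), c = letters[rep(1:4, 25)])
rows_shown(df, max = 30)   # 30 %/% 3 columns = 10 rows shown, 90 omitted
```

Note that when rows are omitted the formatted matrix has n0 rows rather
than n, so any replacement row names would need length n0 as well.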

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
