Re: [Rd] Unique ID for conditions to supress/rethrow selected conditions?

2023-04-16 Thread nos...@altfeld-im.de
On Sun, 2023-04-16 at 13:52 +0200, Iñaki Ucar wrote:

> I agree that something like this would be a nice addition. With the
> current condition system, it would be certainly easy (but quite a lot
> of work) to define a hierarchy of built-in conditions, and then use
> them consistently throughout base R.

Yes, a typed condition system would be great.

I have two other ideas:



By reading the "R messages" and "Preparing translations" sections of the
"Writing R Extensions" manual

https://cran.r-project.org/doc/manuals/r-release/R-exts.html#R-messages

I was thinking about using the "unique" R message texts (which are the msgid in 
the *.po files,
see e.g. 
https://github.com/r-devel/r-svn/blob/60a4db2171835067999e96fd2751b6b42c6a6ebc/src/library/base/po/de.po#L892)
to maintain a unique ID (not dependent on the actual translation into the 
current language).

A "simple" solution could be to pre- or postfix each message text with an ID, 
for example this code here

 else errorcall(call, _("non-numeric argument to function"));
 # 
https://github.com/r-devel/r-svn/blob/49597237842697595755415cf9147da26c8d1088/src/main/complex.c#L347

would become

 else errorcall(call, _("non-numeric argument to function [47]"));
or
 else errorcall(call, _("[47] non-numeric argument to function"));

Now the ID could be extracted more easily (at least for base R condition 
messages)...

This would even be back-portable to older R versions to make condition IDs 
broadly available "in the wild".
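
If messages carried such a bracketed ID, extracting it and muffling selected warnings could be sketched in R as follows. This is only an illustration: the "[47]" prefix format and the helper names extract_condition_id() and suppress_ids() are hypothetical, and base R does not emit such IDs today.

```r
# Hypothetical: assumes condition messages start with a numeric ID like "[47] ..."
extract_condition_id <- function(cond) {
  msg <- conditionMessage(cond)
  m <- regmatches(msg, regexec("^\\[([0-9]+)\\]", msg))[[1L]]
  # m is c(full match, captured digits) on success, character(0) otherwise
  if (length(m) == 2L) as.integer(m[2L]) else NA_integer_
}

# Evaluate `expr` and muffle every warning whose ID is listed in `ids`
suppress_ids <- function(expr, ids) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (extract_condition_id(w) %in% ids) invokeRestart("muffleWarning")
    }
  )
}

res <- suppress_ids({
  warning("[47] non-numeric argument to function")
  "done"
}, ids = 47L)
```

Because the ID does not depend on the translated text, the same handler would work under any language setting, which is the point of the proposal.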



Another way to introduce an ID for each condition in base R would be ("the hard 
way")

1) by refactoring each and every code location with an embedded message string 
to use a centralized
   key/msg_text data structure to "look up" the appropriate message text and

2) use the key to enrich the condition as unique ID (e.g. as an attribute in 
the condition object).
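
At the R level, the two steps above could look roughly like this. It is a sketch under assumptions, not how base R works: the catalogue msg_catalog, the key "E0047", the class prefix "cond_" and the field name id are all made up for illustration.

```r
# Hypothetical central key -> message text table (step 1)
msg_catalog <- c(E0047 = "non-numeric argument to function")

# Build an error condition that carries its key as a unique ID (step 2)
make_condition <- function(key, call = NULL) {
  structure(
    class = c(paste0("cond_", key), "error", "condition"),
    list(message = msg_catalog[[key]], call = call, id = key)
  )
}

# A handler can now identify the condition without looking at the message text
res <- tryCatch(
  stop(make_condition("E0047")),
  error = function(e) e$id
)
```

Giving each condition its own class as well as an id field means handlers could select it either by class (tryCatch(cond_E0047 = ...)) or by comparing the ID.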

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Unique ID for conditions to supress/rethrow selected conditions?

2023-04-16 Thread nos...@altfeld-im.de
I am the author of the *tryCatchLog* package and want to

- suppress selected conditions (warnings and messages)
- rethrow selected conditions (e.g. a specific warning as a message, or to
  "rename" the condition text).

I could not find any reliable unique identifier for each possible condition

- that (base) R throws
- that 3rd-party packages can throw (out of scope here).



Is there any reliable way to identify each possible condition of base R?

Are there plans to implement such an identifier ("errno")?



PS: Things that do not work well enough IMHO:

1. Just use the condition classes (not unique enough to distinguish between
   each and every condition)

2. Try to match the condition text
   (it depends on the active language setting in R, which cannot be switched
   "on the fly" on every platform, and wordings or translations may even
   change in the future)
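
To make option 2 concrete, here is a minimal sketch of rethrowing a selected warning as a message by matching its text. The helper name warning_to_message() and the "Rethrown: " prefix are made up for this example; the match is exactly the fragile part, since it breaks as soon as the text is translated or reworded.

```r
# Rethrow warnings whose text contains `pattern` as messages;
# all other warnings pass through unchanged.
warning_to_message <- function(expr, pattern) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl(pattern, conditionMessage(w), fixed = TRUE)) {
        message("Rethrown: ", conditionMessage(w))
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Collect the emitted messages so the effect can be inspected
msgs <- character()
res <- withCallingHandlers(
  warning_to_message({ warning("NaNs produced"); 1 }, pattern = "NaNs"),
  message = function(m) {
    msgs <<- c(msgs, conditionMessage(m))
    invokeRestart("muffleMessage")
  }
)
```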



Re: [Rd] Slow try in combination with do.call

2021-10-12 Thread nos...@altfeld-im.de
In fact an attentive user recently reported the same type of problem (slow due to
deparse) in my tryCatchLog package when using a large sparse matrix

https://github.com/aryoda/tryCatchLog/issues/68

and I have fixed it by explicitly using the nlines arg of deparse() instead of 
using as.character()
which implicitly calls deparse() for a call stack.
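
The difference between the two approaches can be sketched like this. The call object big_call is artificial (built by hand rather than taken from a real call stack), and the timings depend on the machine, so no numbers are claimed here.

```r
# A call object that embeds a large argument, similar to what do.call() builds
big_call <- as.call(list(as.name("f"), x = rep(list(mtcars), 50L)))

# Deparses the whole call, then discards everything but the first line
t_full <- system.time(first_full <- deparse(big_call)[1L])

# Asks deparse() to stop after the first line of output
t_one <- system.time(first_one <- deparse(big_call, nlines = 1L))

# Elapsed seconds for both variants
print(c(full = t_full[["elapsed"]], nlines1 = t_one[["elapsed"]]))
```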

Looking for a fix I think I may have found inconsistent deparse default 
arguments in base R between as.character() and deparse():

A direct deparse call in R uses
control = c("keepNA", "keepInteger", "niceNames", "showAttributes")
as default (see ?.deparseOpts for details).

The as.character() implementation in the C code of base R calls the 
internal deparse C function
with another default for .deparseOpts:
The SIMPLEDEPARSE C constant which corresponds to control = NULL.

https://github.com/wch/r-source/blob/54f94f0433c487fe3b0df9bae477c9babdd1/src/main/deparse.c#L345

This is clearly no bug but maybe the as.character() implementation should use 
the default args of deparse() for consistency (just a proposal!)...
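
A small illustration of how the two defaults differ, using an integer constant (the example values are my own, chosen only to make the difference visible):

```r
# Default control of deparse() includes "keepInteger", so the L suffix is kept
d_default <- deparse(2L)                  # "2L"

# control = NULL corresponds to the simple deparse used internally
d_simple <- deparse(2L, control = NULL)   # "2"

# as.character() on a call converts each element with the simple options
ch <- as.character(quote(f(2L)))          # "f" "2"
```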

BTW: You can find my analysis result with the call path and links to the R 
source code in the github issue:
 https://github.com/aryoda/tryCatchLog/issues/68#issuecomment-930593002



On Thu, 2021-09-16 at 18:04 +0200, Martin Maechler wrote:
> > > > > > Martin Maechler 
> > > > > > on Thu, 16 Sep 2021 17:48:41 +0200 writes:
> > > > > > Alexander Kaever 
> > > > > > on Thu, 16 Sep 2021 14:00:03 + writes:
> 
> >> Hi,
> >> It seems like a try(do.call(f, args)) can be very slow on error 
> depending on the args size. This is related to a complete deparse of the call
> using deparse(call)[1L] within the try function. How about replacing 
> deparse(call)[1L] by deparse(call, nlines = 1)?
> 
> >> Best,
> >> Alex
> 
> > an *excellent* idea!
> 
> > I have checked that the resulting try() object continues to contain the
> > long large call; indeed that is not the problem, but the
> > deparse()ing  *is* as you say above.
> 
> > {The experts typically use  tryCatch() directly, instead of  try() ,
> > which may be the reason other experienced R developers have not
> > stumbled over this ...}
> 
> > Thanks a lot, notably also for the clear  repr.ex. below.
> 
> > Best regards,
> > Martin
> 
> OTOH, I find so many cases  of   deparse(*)[1]  (or similar) in
> R's own sources, I'm wondering
> if I'm forgetting something ... and using nlines=* is not always
> faster & equivalent and hence better ??
> 
> Martin
> 
> 
> 
> 
> >> Example:
> 
> >> fun <- function(x) {
> >> stop("testing")
> >> }
> >> d <- rep(list(mtcars), 1)
> >> object.size(d)
> >> # 72MB
> 
> >> system.time({
> >> try(do.call(fun, args = list(x = d)))
> >> })
> >> # 8s
> 



Re: [Rd] R 4.0.2 64-bit Windows hangs

2020-08-21 Thread nos...@altfeld-im.de
This may be unrelated, but on SO there is a report that a Windows update may cause
this problem:

https://stackoverflow.com/questions/63457321/r-will-not-run-after-latest-windows-10-updates/63524608#63524608



On Fri, 2020-08-21 at 12:34 +, m1388m+moe1ydyn0hbs--- via R-devel wrote:
> I am having exactly the same issue as the following bug report: 
> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16515.
> 
> RTerm.exe hangs on startup, nothing is printed to the terminal. 32-bit 
> RTerm.exe runs fine.
> 
> No errors are displayed, but I see the same as the bug report in Event Viewer.
> 
> I am running Windows 10 64-bit, v2010.



[Rd] Guidelines when to use LF vs CRLF ("\n" vs. "\r\n") on Windows for new lines (line endings)?

2020-07-25 Thread nos...@altfeld-im.de
Dear R developers,

I am developing an R package which returns strings with new line codes.
I am not sure if I should use "\r\n" or "\n" in my returned strings on Windows 
platforms.

What is the recommended best practice for package developers (and code in base 
R) for coding new lines in strings?

And just out of curiosity: What is the reason (or history) for preferring "\n" 
in R even on Windows (see examples below)?

Best regards

Jürgen

PS: Examples from base R:

R seems to use (almost) only "\n" for new lines internally - even on Windows
platforms, e.g.:

charToRaw(paste0("a", "\n", "b"))
[1] 61 0a 62

# eol default is "\n"
write.table(x, file = "", append = FALSE, quote = TRUE, sep = " ",
eol = "\n", na = "NA", dec = ".", row.names = TRUE,
col.names = TRUE, qmethod = c("escape", "double"),
fileEncoding = "")

On the other hand some external interfaces require Windows-style new lines
("\r\n"), e.g. text file output, which seems to be taken care of internally:

writeLines(text, con = stdout(), sep = "\n", useBytes = FALSE)
# Excerpt from the documentation:
# Normally writeLines is used with a text-mode connection,
# and the default separator is converted to the normal separator
# for that platform (LF on Unix/Linux, CRLF on Windows).

# calls internally do_writelines():
# 
https://github.com/wch/r-source/blob/8db7b85953127f364f52d201ec057911db4601e5/src/main/connections.c#L4023
# But: Where is the conversion done (hidden in the call to Riconv()?)
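
One way to see exactly which bytes end up in a file, independent of the platform's text-mode translation, is to write through a binary-mode connection and read the raw bytes back. This is just a small check I would use; the temporary file is arbitrary.

```r
tmp <- tempfile()

# Binary mode ("wb"): no platform-specific newline translation takes place
con <- file(tmp, open = "wb")
writeLines(c("a", "b"), con, sep = "\n")
close(con)

# Inspect the bytes that were actually written
bytes <- readBin(tmp, what = "raw", n = 10L)
print(bytes)
```

Opening the same file with open = "w" (text mode) instead is the case where the documentation quoted above says the separator is converted on Windows.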



[Rd] Why does INT 3 (opcode 0xCC) SIGTRAP break to debugger (gdb) in Rgui.exe and Rterm.exe but NOT in R.exe on Windows (64 bit)?

2019-12-11 Thread nos...@altfeld-im.de
I am developing a package to improve the debugging of Rcpp (C++) and SEXP based 
C code in gdb
by providing convenience print, subset and other functions:

https://github.com/aryoda/R_CppDebugHelper

I also want to solve the Windows-only problem that you can break into the 
debugger from R
only via Rgui.exe (menu "Misc > break to debugger") by supporting breakpoints 
for R.exe.

I want breakpoints support in R.exe because debugging in Rgui.exe has an 
unwanted side effect:

https://stackoverflow.com/questions/59236579/gdb-prints-output-stdout-to-rgui-console-instead-of-gdb-console-on-windows-whe

My idea is to break into the debugger from R.exe by calling a little C(++) code 
that contains an INT 3 (opcode 0xCC) SIGTRAP code:

// break_to_debugger.cpp
// [[Rcpp::export]]
int break_to_debugger()
{
  int a = 3;
  asm("int $3");  // this code line shall break into the debugger
  // Idea taken from "Rgui > break into debugger":
  // 
https://github.com/wch/r-source/blob/5a156a0865362bb8381dcd69ac335f5174a4f60c/src/gnuwin32/rui.c#L431
  a++;
  return a;
}

# breakpoint.R
#' breaks the execution into the debugger
#'
#' @return
#' @export
breakpoint <- function() {
  break_to_debugger()
}

Surprisingly this works not only on Linux but also on Windows (v10, x64 
architecture = 64 bit) in Rterm.exe,
but NOT for R.exe (64 bit):

- Rgui.exe:    Works
- Rscript.exe: Works
- R.exe:       Does not work: R.exe is exited with:
               [Inferior 1 (process 20704) exited with code 0203]

Can you please help me to understand why it works for Rgui.exe and Rscript.exe 
but not for R.exe?

Why is int 3 exiting R.exe?

And: How could I make it also work with R.exe?

Thanks a lot for sharing your ideas and experiences!

Jürgen

PS 1: My sessionInfo():
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

PS 2: My package "CppDebugHelper" was compiled with -g -O0 -std=c++11

PS 3: Here is my captured gdb output for the three test cases:

1. Rgui.exe 


>gdb --quiet --args Rgui.exe --silent --vanilla
Reading symbols from Rgui.exe...(no debugging symbols found)...done.
(gdb) run
Starting program: C:\R\bin\x64\Rgui.exe --silent --vanilla
[New Thread 14476.0x3710]
[New Thread 14476.0x284c]
[New Thread 14476.0x50ec]
[New Thread 14476.0x2d24]
warning: Invalid parameter passed to C runtime function.
[In RGui's R console:]
library(CppDebugHelper)
breakpoint()
[in gdb again:]
Program received signal SIGTRAP, Trace/breakpoint trap.
break_to_debugger () at break_to_debugger.cpp:33
33          a++;
(gdb) b debug_example_rcpp
Breakpoint 1 at 0x66ac6846: file debug_example_rcpp.cpp, line 13.
(gdb) continue
Continuing.
[In RGui's R console:]
debug_example_rcpp()
[in gdb again:]
Breakpoint 1, debug_example_rcpp () at debug_example_rcpp.cpp:13
13          CharacterVector cv   = CharacterVector::create("foo", "bar", NA_STRING, "hello")  ;
(gdb) next
14          NumericVector nv     = NumericVector::create(0.0, 1.0, NA_REAL, 10) ;
(gdb) n
16          DateVector dv        = DateVector::create( 14974, 14975, 15123, NA_REAL); // TODO how to use real dates instead?
(gdb) n
17          DateVector dv2       = DateVector::create(Date("2010-12-31"), Date("01.01.2011", "%d.%m.%Y"), Date(2011, 05, 29), NA_REAL);
(gdb) n
18          DatetimeVector dtv   = DatetimeVector::create(1293753600, Datetime("2011-01-01"), Datetime("2011-05-29 10:15:30"), NA_REAL);
(gdb) n
19          DataFrame df         = DataFrame::create(Named("name1") = cv, _["value1"] = nv, _["dv2"] = dv2);  // Named and _[] are the same
(gdb) n
20  CharacterVector col1 = df["name1"];  // get the first column
(gdb) call dbg_print(df)
(gdb) call dbg_str(df)
(gdb) continue
Continuing.

[Output for the dbg_* function calls is printed to Rgui's R console (NOT the 
gdb terminal!):]

  name1 value1        dv2
1   foo      0 2010-12-31
2   bar      1 2011-01-01
3  <NA>     NA 2011-05-29
4 hello     10       <NA>

'data.frame':   4 obs. of  3 variables:
$ name1 : Factor w/ 3 levels "bar","foo","hello": 2 1 NA 3
$ value1: num  0 1 NA 10
$ dv2   : Date, format: "2010-12-31" "2011-01-01" ...



2. R.exe 


>gdb --quiet --args R.exe --silent --vanilla
Reading symbols from R.exe...(no debugging symbols found)...done.
(gdb) r
Starting program: C:\R\bin\x64\R.exe --silent --vanilla
[New Thread 20704.0x2b20]
[New Thread 20704.0x4c08]
[New Thread 20704.0x425c]
[New Thread 20704.0x45f8]
> library(CppDebugHelper)
> breakpoint()
[Thread 20704.0x45f8 exited with code 2147483651]
[Thread 20704.0x425c exited with code 2147483651]
[Thread 20704.0x4c08 exited with code 2147483651]
[Inferior 1 (process 20704) exited with code 0203]
(gdb) bt
No stack.
(gdb)



3. Rterm.exe 


gdb --quiet 

Re: [Rd] typeof(getOption("warn")) is "integer" instead of "double" in R unstable (2019-09-27 r77229)? Reproducible?

2019-09-29 Thread nos...@altfeld-im.de
Thanks a lot for pointing out the reason
(and yes, I am testing far too stringently in this case - it's my old testing
disease ;-)

For other readers:

The R-devel NEWS is a good source to find possible change reasons:

https://stat.ethz.ch/R-manual/R-devel/doc/html/NEWS.html
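
A less stringent precondition check, along the lines Duncan suggests below, could test only that the option is a single numeric value instead of insisting on a particular storage type. This is my own sketch and not the actual test from tryCatchLog:

```r
warn <- getOption("warn")

# Accepts both "integer" and "double" storage, which is all the docs promise
warn_is_valid <- is.numeric(warn) && length(warn) == 1L && !is.na(warn)
stopifnot(warn_is_valid)
```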


On Sun, 2019-09-29 at 08:33 -0400, Duncan Murdoch wrote:
> On 29/09/2019 7:55 a.m., nos...@altfeld-im.de wrote:
> > Hi,
> > 
> > I have a failing unit test in my package tryCatchLog on the CRAN build 
> > infrastructure
> > (https://cran.r-project.org/web/checks/check_results_tryCatchLog.html)
> > with "R Under development (unstable) (2019-09-27 r77229)"
> > and the unit test just ensures consistent behaviour of R (not of my 
> > package) as a precondition:
> > 
> > The failing unit test is caused by
> > > typeof(getOption("warn"))
> > > [1] "integer"
> > 
> > but it should be
> > > [1] "double"
> 
> This is related to this bug fix:
> 
> CHANGES IN R 3.6.1 patched BUG FIXES
> 
>  ‘options(warn=1e11)’ is an error now, instead of later leading to C 
> stack overflow because of infinite recursion.
> 
> which occurred in rev 77226.  It explicitly coerces the warn value to 
> integer.
> 
> 
> > I have no build infrastructure for dev and want to find out if this is 
> > caused by
> > - my mistake
> > - changes in the R dev version
> > - the new C compilers used (correlates with the failing unit test)
> 
> It is changes in the dev and patched versions, and also your mistake: 
> your test shouldn't be so stringent.  The docs don't say that the value 
> has to be a double; in fact, they suggest it should be a whole number 
> value (talking about 0, 1, "2 or more", not about what would happen with 
> options(warn = pi/2), for example.
> 
> In older versions, options(warn = pi/2) is treated the same as 
> options(warn = 1), and in the new version, it is displayed as 1 as well.
> 
> Duncan Murdoch
> > 
> > Can somebody (having the R dev version available) please help me and answer 
> > the result of
> > 
> > > typeof(getOption("warn"))
> > 
> > using "R Under development (unstable) (2019-09-27 r77229)" or newer?
> > 
> > Thanks a lot and sorry for the "noise"!
> > 
> > Jurgen
> > 
> > PS: These R (dev) versions did work as expected (returning "double") but 
> > were also using older C compilers:
> > - R Under development (unstable) (2019-09-20 r77199)
> > - R Under development (unstable) (2019-09-22 r77202)
> > - R Under development (unstable) (2019-09-25 r77217)
> > - R version 3.6.1 Patched (2019-09-25 r77224)
> > - R version 3.6.1 (2019-07-05)
> > - R version 3.6.0 beta (2019-04-15 r76395)
> > - R version 3.5.3 (2019-03-11)
> > - R version 3.5.2 (2018-12-20)
> > 
> > 
> 
>



[Rd] typeof(getOption("warn")) is "integer" instead of "double" in R unstable (2019-09-27 r77229)? Reproducible?

2019-09-29 Thread nos...@altfeld-im.de
Hi,

I have a failing unit test in my package tryCatchLog on the CRAN build 
infrastructure
(https://cran.r-project.org/web/checks/check_results_tryCatchLog.html)
with "R Under development (unstable) (2019-09-27 r77229)"
and the unit test just ensures consistent behaviour of R (not of my package)
as a precondition:

The failing unit test is caused by
> typeof(getOption("warn"))
> [1] "integer"

but it should be
> [1] "double"

I have no build infrastructure for dev and want to find out if this is caused by
- my mistake
- changes in the R dev version
- the new C compilers used (correlates with the failing unit test)

Can somebody (having the R dev version available) please help me and answer the 
result of

> typeof(getOption("warn"))

using "R Under development (unstable) (2019-09-27 r77229)" or newer?

Thanks a lot and sorry for the "noise"!

Jurgen

PS: These R (dev) versions did work as expected (returning "double") but were 
also using older C compilers:
- R Under development (unstable) (2019-09-20 r77199)
- R Under development (unstable) (2019-09-22 r77202)
- R Under development (unstable) (2019-09-25 r77217)
- R version 3.6.1 Patched (2019-09-25 r77224)
- R version 3.6.1 (2019-07-05)
- R version 3.6.0 beta (2019-04-15 r76395)
- R version 3.5.3 (2019-03-11)
- R version 3.5.2 (2018-12-20)



Re: [Rd] Problem building rmarkdown vignettes with child

2018-11-10 Thread nos...@altfeld-im.de
Which R version are you using to reproduce the problem?

A few first indications:

- The regex in ".install_extras" does not match your file endings: Change 
"Rmd_tmp$" into "Rmd_t$"

- Try "output: rmarkdown::html_vignette" instead of "output: html_document"
  in the header of the file "ABVignetteWithLocalChild.Rmd" (and possibly other 
"*.Rmd"s)

- Try to specify the child doc name directly in the chunks via "```{r child = 
"NoBuildVignette.Rmd_t"}"
  instead of "```{r includChild, child = child_docs}"
  Note the possible typo in the tag "includChild" (-> "includeChild"?) (and 
possibly other "*.Rmd"s)

PS: You can find a working example of child Rmds for a CRAN package here:

https://github.com/aryoda/tryCatchLog/tree/master/vignettes




On Wed, 2018-11-07 at 13:33 +0100, Witold E Wolski wrote:
> Hello,
> 
> This is a problem I posted about already some time ago:
> https://stat.ethz.ch/pipermail/r-devel/2018-September/076786.html
> 
> I finally had some time to create a minimal package to reproduce
> the problem that vignettes with child documents cannot be built.
> https://github.com/wolski/RmarkdownVignetteProblem
> 
> The problem basically is that while all the vignettes can be built by running
> 
> devtools::build_vignettes
> or
> rmarkdown::render
> 
> they will all fail to build when running
> devtools::build()
> or
> R CMD build
> 
> except for the
> ABVignetteWithLocalChild.Rmd
> for which I did apply the workaround suggested by  Duncan in this github 
> issue:
> https://github.com/yihui/knitr/issues/1540
> 
> 
> Best regards
> Witek
> 
> 
>



Re: [Rd] Missing objects using dump.frames for post-mortem debugging of crashed batch jobs. Bug or gap in documentation?

2016-11-14 Thread nos...@altfeld-im.de
Martin, thanks for the good news, and sorry for wasting your (and others')
time by not doing my homework and querying Bugzilla first (lesson learned!).

I have tested the new implementation from R-devel and observe a semantic
difference when playing with the parameters:

  # Test script 1
  g <- "global"
  f <- function(p) {
    l <- "local"
    dump.frames()
  }
  f("parameter")

results in
  # > debugger()
  # Message:  object 'server' not found
  # Available environments had calls:
  # 1: source("~/.active-rstudio-document", echo = TRUE)
  # 2: withVisible(eval(ei, envir))
  # 3: eval(ei, envir)
  # 4: eval(expr, envir, enclos)
  # 5: .active-rstudio-document#9: f("parameter")
  # 
  # Enter an environment number, or 0 to exit  
  # Selection: 5
  # Browsing in the environment with call:
  #   .active-rstudio-document#9: f("parameter")
  # Called from: debugger.look(ind)
  # Browse[1]> g
  # [1] "global"
  # Browse[1]> 

while dumping to a file

  # Test script 2
  g <- "global"
  f <- function(p) {
    l <- "local"
    dump.frames(to.file = TRUE, include.GlobalEnv = TRUE)
  }
  f("parameter")

results in
  # > load("last.dump.rda")
  # > debugger()
  # Message:  object 'server' not found
  # Available environments had calls:
  # 1: .GlobalEnv
  # 2: source("~/.active-rstudio-document", echo = TRUE)
  # 3: withVisible(eval(ei, envir))
  # 4: eval(ei, envir)
  # 5: eval(expr, envir, enclos)
  # 6: .active-rstudio-document#11: f("parameter")
  # 
  # Enter an environment number, or 0 to exit  
  # Selection: 6
  # Browsing in the environment with call:
  #   .active-rstudio-document#11: f("parameter")
  # Called from: debugger.look(ind)
  # Browse[1]> g
  # Error: object 'g' not found
  # Browse[1]> 

The semantic difference is that the global variable "g" is visible
within the function "f" in the first version, but not in the second
version.

If I dump to a file and then load and debug it, the search path through the
frames at debug time is not the same as at run time.

An implementation with the same semantics could be achieved
by applying this workaround currently:

  dump.frames()
  save.image(file = "last.dump.rda")

Does it possibly make sense to unify the semantics?
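
For completeness, here is a small self-contained sketch of the file-based variant from test script 2 that writes the dump into a temporary directory and loads it back into a separate environment. The helper dump_with_globals() is hypothetical (my own wrapper for this example); it relies only on the dump.frames() arguments dumpto, to.file and include.GlobalEnv used above.

```r
# Hypothetical helper: write a dump (including a copy of the global
# environment) to <dir>/last.dump.rda and return the file path.
dump_with_globals <- function(dir = tempdir()) {
  old_wd <- setwd(dir)                 # dump.frames() writes to the working dir
  on.exit(setwd(old_wd), add = TRUE)
  dump.frames(dumpto = "last.dump", to.file = TRUE, include.GlobalEnv = TRUE)
  file.path(dir, "last.dump.rda")
}

f <- function(p) {
  l <- "local"
  dump_with_globals()
}
dump_path <- f("parameter")

# Later, e.g. in a new session: load the dump without touching the workspace
dump_env <- new.env()
load(dump_path, envir = dump_env)
print(names(dump_env$last.dump)[1L])
```

In an interactive session one would then call debugger(dump_env$last.dump) to browse the frames.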

THX!


On Mon, 2016-11-14 at 11:34 +0100, Martin Maechler wrote:
> >>>>> nospam@altfeld-im de <nos...@altfeld-im.de>
> >>>>> on Sun, 13 Nov 2016 13:11:38 +0100 writes:
> 
> > Dear R friends, to allow post-mortem debugging In my
> > Rscript based batch jobs I use
> 
> >tryCatch( , error = function(e) {
> > dump.frames(to.file = TRUE) })
> 
> > to write the called frames into a dump file.
> 
> > This is similar to the method recommended in the "Writing
> > R extensions" manual in section 4.2 Debugging R code (page
> > 96):
> 
> > https://cran.r-project.org/doc/manuals/R-exts.pdf
> 
> >> options(error = quote({dump.frames(to.file=TRUE); q()}))
> 
> 
> 
> > When I load the dump later in a new R session to examine
> > the error I use
> 
> > load(file = "last.dump.rda") debugger(last.dump)
> 
> > My problem is that the global objects in the workspace are
> > NOT contained in the dump since "dump.frames" does not
> > save the workspace.
> 
> > This makes debugging difficult.
> 
> 
> 
> > For more details see the stackoverflow question + answer
> > in:
> > 
> https://stackoverflow.com/questions/40421552/r-how-make-dump-frames-include-all-variables-for-later-post-mortem-debugging/40431711#40431711
> 
> 
> 
> > I think the reason of the problem is:
> > 
> 
> > If you use dump.frames(to.file = FALSE) in an interactive
> > session debugging works as expected because it creates a
> > global variable called "last.dump" and the workspace is
> > still loaded.
> 
> > In the batch job scenario however the workspace is NOT
> > saved in the dump and therefore lost if you debug the dump
> > in a new session.
> 
> 
> > Options to solve the issue:
> > --
> 
> > 1. Improve the documentation of the R help for
> > "dump.frames" and the R_exts manual to propose another
> > code snippet for batch job scenarios:
> 
> >   dump.frames() save.image(file = "last.dump.rda")
> 
> > 2. Change the semantics of "dump.frames(to.file = TRUE)"
> > to inc

[Rd] Missing objects using dump.frames for post-mortem debugging of crashed batch jobs. Bug or gap in documentation?

2016-11-13 Thread nos...@altfeld-im.de
Dear R friends,

to allow post-mortem debugging in my Rscript-based batch jobs I use

   tryCatch( ,
             error = function(e)
             {
               dump.frames(to.file = TRUE)
             })

to write the called frames into a dump file.

This is similar to the method recommended in the "Writing R extensions"
manual in section 4.2 Debugging R code (page 96):

https://cran.r-project.org/doc/manuals/R-exts.pdf

> options(error = quote({dump.frames(to.file=TRUE); q()}))



When I load the dump later in a new R session to examine the error I use

load(file = "last.dump.rda")
debugger(last.dump)

My problem is that the global objects in the workspace are NOT contained
in the dump since "dump.frames" does not save the workspace.

This makes debugging difficult.



For more details see the stackoverflow question + answer in:
https://stackoverflow.com/questions/40421552/r-how-make-dump-frames-include-all-variables-for-later-post-mortem-debugging/40431711#40431711



I think the reason for the problem is:


If you use dump.frames(to.file = FALSE) in an interactive session
debugging works as expected because it creates a global variable called
"last.dump" and the workspace is still loaded.

In the batch job scenario however the workspace is NOT saved in the dump
and therefore lost if you debug the dump in a new session.


Options to solve the issue:
--

1. Improve the documentation of the R help for "dump.frames" and the
   R_exts manual to propose another code snippet for batch
   job scenarios:

  dump.frames()
  save.image(file = "last.dump.rda")

2. Change the semantics of "dump.frames(to.file = TRUE)" to include
   the workspace in the dump.
   This would change the semantics implied by the function name
   but makes the semantics consistent for both "to.file" param values.



Re: [Rd] iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)

2016-02-23 Thread nos...@altfeld-im.de
Excellent analysis, thank you both for the quick reply!

Is there anything I can do to get the bug fixed in the next version of R
(e. g. filing a bug report at https://bugs.r-project.org/bugzilla3/)?


On Tue, 2016-02-23 at 14:06 +0200, Mikko Korpela wrote:
> On 23.02.2016 11:37, Martin Maechler wrote:
> >>>>>> nospam@altfeld-im de <nos...@altfeld-im.de>
> >>>>>> on Mon, 22 Feb 2016 18:45:59 +0100 writes:
> > 
> > > Dear R developers
> > > I think I have found a bug that can be reproduced with two lines of 
> > code
> > > and I am very thankful to get your first assessment or feed-back on my
> > > report.
> > 
> > > If this is the wrong mailing list or I did something wrong
> > > (e. g. semi "anonymous" email address to protect my privacy and defend
> > > unwanted spam) please let me know since I am new here.
> > 
> > > Thank you very much :-)
> > 
> > > J. Altfeld
> > 
> > Dear J.,
> > (yes, a bit less anonymity would be very welcomed here!),
> > 
> > You are right, this is a bug, at least in the documentation, but
> > probably "all real", indeed,
> > 
> > but read on.
> > 
> > > On Tue, 2016-02-16 at 18:25 +0100, nos...@altfeld-im.de wrote:
> > >> 
> > >> 
> > >> If I execute the code from the "?write.table" examples section
> > >> 
> > >> x <- data.frame(a = I("a \" quote"), b = pi)
> > >> # (omitted code)
> > >> write.csv(x, file = "foo.csv", fileEncoding = "UTF-16LE")
> > >> 
> > >> the resulting CSV file has a size of 6 bytes which is too short
> > >> (truncated):
> > >> 
> > >> """,3
> > 
> > reproducibly, yes.
> > If you look at what write.csv does
> > and then simplify, you can get a similar wrong result by
> > 
> >   write.table(x, file = "foo.tab", fileEncoding = "UTF-16LE")
> > 
> > which results in a file with one line
> > 
> > """ 3
> > 
> > and if you debug  write.table() you see that its building blocks
> > here are
> >  file <- file(, encoding = fileEncoding)
> > 
> > a writeLines(*, file=file)  for the column headers,
> > 
> > and then "deeper down" C code which I did not investigate.
> 
> I took a look at connections.c. There is a call to strlen() that gets
> confused by null characters. I think the obvious fix is to avoid the
> call to strlen() as the size is already known:
> 
> Index: src/main/connections.c
> ===
> --- src/main/connections.c (revision 70213)
> +++ src/main/connections.c (working copy)
> @@ -369,7 +369,7 @@
>   /* is this safe? */
>   warning(_("invalid char string in output conversion"));
>   *ob = '\0';
> - con->write(outbuf, 1, strlen(outbuf), con);
> + con->write(outbuf, 1, ob - outbuf, con);
>   } while(again && inb > 0);  /* it seems some iconv signal -1 on
>  zero-length input */
>  } else
> 
> 
> > 
> > But just looking a bit at such a file() object with writeLines()
> > seems slightly revealing, as e.g., 'eol' does not seem to
> > "work" for this encoding:
> > 
> > > fn <- tempfile("ffoo"); ff <- file(fn, open="w", encoding = 
> > "UTF-16LE")
> > > writeLines(LETTERS[3:1], ff); writeLines("|", ff); writeLines(">a", 
> > ff)
> > > close(ff)
> > > file.show(fn)
> > CBA|>
> > > file.size(fn)
> > [1] 5
> > > 
> 
> With the patch applied:
> 
> > readLines(fn, encoding="UTF-16LE", skipNul=TRUE)
> [1] "C"  "B"  "A"  "|"  ">a"
> > file.size(fn)
> [1] 22
> 
> - Mikko Korpela
> 
> > >> The problem seems to be the iconv function:
> > >> 
> > >> iconv("foo", to="UTF-16")
> > >> 
> > >> produces
> > >> 
> > >> Error in iconv("foo", to = "UTF-16"):
> > >> embedded nul in string: '\xff\xfef\0o\0o\0'
> > 
> > but this works
> > 
> > > iconv("foo", to="UTF-16", toRaw=TRUE)
> > [[1]]
> > [1] ff fe 66 00 6f 00 6f 00
> > 
> > (indeed showing the embedded '\0's)
> > 
> > >> In 2010 a (partial) patch for this problem was submitted:
> > >> http://tolstoy.newcastle.edu.au/R/e10/devel/10/06/0648.html
> > 
> > the patch only related to the iconv() problem not allowing 'raw'
> > (instead of character) argument x.
> > 
> > ... and it is > 5.5 years old, for an iconv() version that was less
> > featureful than today.
> > Rather, current iconv(x) allows x to be a list of raw entries.
> > 
> > 
> > >> Are there chances to fix this problem since it prevents writing 
> > Windows
> > >> UTF-16LE text files?
> > 
> > >> 
> > >> PS: This problem can be reproduced on Windows and Linux.
> > 
> > indeed also on "R devel of today".
> > 
> > I agree it should be fixed... but as I said not by the patch you
> > mentioned.
> > 
> > Tested patches to fix this are welcome, indeed.
>



Re: [Rd] iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)

2016-02-22 Thread nos...@altfeld-im.de
Dear R developers

I think I have found a bug that can be reproduced with two lines of code
and I would be very thankful for your first assessment or feedback on my
report.

If this is the wrong mailing list or I did something wrong
(e. g. semi "anonymous" email address to protect my privacy and defend against
unwanted spam) please let me know since I am new here.

Thank you very much :-)

J. Altfeld

On Tue, 2016-02-16 at 18:25 +0100, nos...@altfeld-im.de wrote:
> 
> 
> If I execute the code from the "?write.table" examples section
> 
>   x <- data.frame(a = I("a \" quote"), b = pi)
>   # (omitted code)
>   write.csv(x, file = "foo.csv", fileEncoding = "UTF-16LE")
> 
> the resulting CSV file has a size of 6 bytes which is too short
> (truncated):
> 
>   """,3
> 
> The problem seems to be the iconv function:
> 
>   iconv("foo", to="UTF-16")
> 
> produces
> 
>   Error in iconv("foo", to = "UTF-16"):
>   embedded nul in string: '\xff\xfef\0o\0o\0'
> 
> In 2010 a (partial) patch for this problem was submitted:
> 
> http://tolstoy.newcastle.edu.au/R/e10/devel/10/06/0648.html
> 
> Are there chances to fix this problem since it prevents writing Windows
> UTF-16LE text files?
> 
> 
> 
> PS: This problem can be reproduced on Windows and Linux.
> 
> ---
> 
> > sessionInfo()
> R version 3.2.3 (2015-12-10)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 14.04.3 LTS
> 
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> 
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
> 
> loaded via a namespace (and not attached):
> [1] tools_3.2.3
> >
> 



[Rd] iconv to UTF-16 encoding produces error due to embedded nulls (write.table with fileEncoding param)

2016-02-16 Thread nos...@altfeld-im.de



If I execute the code from the "?write.table" examples section

  x <- data.frame(a = I("a \" quote"), b = pi)
  # (omitted code)
  write.csv(x, file = "foo.csv", fileEncoding = "UTF-16LE")

the resulting CSV file has a size of 6 bytes which is too short
(truncated):

  """,3

The problem seems to be the iconv function:

  iconv("foo", to="UTF-16")

produces

  Error in iconv("foo", to = "UTF-16"):
  embedded nul in string: '\xff\xfef\0o\0o\0'

In 2010 a (partial) patch for this problem was submitted:

http://tolstoy.newcastle.edu.au/R/e10/devel/10/06/0648.html

Are there chances to fix this problem since it prevents writing Windows
UTF-16LE text files?
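
A possible workaround (a sketch only, not a fix for the underlying problem) is to avoid character strings with embedded nuls altogether: iconv() can return raw vectors via toRaw = TRUE, and those can be written with writeBin(). The helper name to_utf16le() is made up for this example, and the input is assumed to be UTF-8 or plain ASCII.

```r
# Convert one string to its UTF-16LE byte sequence (no BOM is added)
to_utf16le <- function(x) {
  iconv(x, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1L]]
}

bytes <- to_utf16le("foo")
print(bytes)

# Write the bytes unchanged to a file and check its size
out <- tempfile(fileext = ".txt")
writeBin(bytes, out)
print(file.size(out))
```

For a whole data frame one would have to build the text lines first (including "\r\n" line endings if Windows tools should read the file) and convert them the same way.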



PS: This problem can be reproduced on Windows and Linux.

---

> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.3 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_3.2.3
>
