Re: [R-pkg-devel] [Re] warning: type of ‘zhpevx_’ does not match original declaration [-Wlto-type-mismatch]

2020-12-17 Thread Pierre Lafaye de Micheaux
Dear Ivan,

Thank you for your comment. And also for the previous one.

I indeed made a mistake with JOBZ, RANGE and UPLO, now changed to:

const char *CJOBZ = jobz[0];
const char *CRANGE = range[0];
const char *CUPLO = uplo[0];
...
delete[] CJOBZ;
delete[] CRANGE;
delete[] CUPLO;

I did a bit of F77 programming almost 20 years ago when I did my Ph.D. thesis, 
but I never programmed in Fortran 2003. I remember having tried a few months 
back to use the iso_c_binding approach (before your previous email) with no 
success. I will give it a try to see if I can make it work.

Have a nice day.

Kind regards,
Pierre


__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] [Re] warning: type of ‘zhpevx_’ does not match original declaration [-Wlto-type-mismatch]

2020-12-17 Thread Ivan Krylov
Dear Pierre L.,

I think that the zhpevxC wrapper, as written, may result in undefined
behaviour:

>const char *JOBZ = jobz[0];

>delete[] JOBZ;

>delete[] Cap;

This could work okay, depending on how the rest of the package is
written, but in general, it is considered a bad idea for linear algebra
routines to deallocate memory they didn't allocate. ("Pointer
ownership is usually retained by the calling code.")
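
A minimal C++ sketch of the ownership point, for illustration only. The names first_flag and caller are hypothetical and are not part of the package. It assumes the caller allocated the string with new[], which is exactly the detail the wrapper cannot know:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical wrapper body: `const char *JOBZ = jobz[0];` copies a pointer,
// not the characters, so JOBZ merely aliases memory owned by the caller.
static char first_flag(char **jobz) {
    const char *JOBZ = jobz[0];  // borrowed, not owned
    return JOBZ[0];              // read only; no delete[] in the wrapper
}

// The code that allocated the buffer is the code that releases it.
static char caller() {
    char *buf = new char[2];
    std::strcpy(buf, "V");
    char *args[1] = {buf};
    char flag = first_flag(args);
    delete[] buf;                // matches the new[] above, exactly once
    return flag;
}
```

Calling delete[] on JOBZ inside first_flag would either free the buffer twice or free memory that was never obtained from new[], both of which are undefined behaviour.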

May I suggest once again the idea of writing a Fortran 2003 wrapper
zhpevxC instead of C++? Subroutines defined using iso_c_binding are
guaranteed to follow the C calling convention, and, this being Fortran,
call zhpevx(...) is guaranteed to match the Fortran calling convention,
bringing you the best of both worlds:

https://stat.ethz.ch/pipermail/r-package-devel/2020q3/005710.html

No need to allocate or deallocate memory or provide different
definitions depending on the availability of FC_LEN_T, just make sure
that both prototypes mean the same thing. By the way,
std::complex is guaranteed to match the memory layout of C type
double _Complex and Fortran type complex(kind = c_double_complex) by
the respective standards.
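
The layout remark can be checked on the C++ side with a short sketch (sum_parts is a hypothetical helper, not something from the thread). It relies on the array-oriented access that the C++ standard allows for std::complex<double>:

```cpp
#include <cassert>
#include <complex>

// View n complex numbers as 2*n doubles (re0, im0, re1, im1, ...) and add
// them up. The reinterpret_cast is the access pattern the C++ standard
// explicitly permits for arrays of std::complex<T>.
static double sum_parts(std::complex<double> *z, int n) {
    static_assert(sizeof(std::complex<double>) == 2 * sizeof(double),
                  "complex<double> is laid out as two doubles");
    double *flat = reinterpret_cast<double *>(z);
    double total = 0.0;
    for (int i = 0; i < 2 * n; ++i) total += flat[i];
    return total;
}
```

Because the layout is fixed, a buffer of std::complex<double> can be handed to a routine expecting pairs of doubles without copying.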

-- 
Best regards,
Ivan



Re: [R-pkg-devel] [Re] warning: type of ‘zhpevx_’ does not match original declaration [-Wlto-type-mismatch]

2020-12-17 Thread Pierre Lafaye de Micheaux
Of course, I meant:

double _Complex *Cap;
memcpy(&Cap, &ap, sizeof(ap));

and not

double _Complex *Cap;
memcpy(&ap, &Cap, sizeof(ap));
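
A small sketch of why the argument order matters, under stated assumptions: RcomplexLike is a stand-in for R's Rcomplex (assumed to be two doubles), std::complex<double> stands in for double _Complex, and Cap is initialised to nullptr here so the reversed case stays well defined. memcpy takes the destination first, so only the first form makes Cap refer to the caller's buffer:

```cpp
#include <cassert>
#include <complex>
#include <cstring>

// Stand-in for R's Rcomplex (assumption: two doubles, r and i).
struct RcomplexLike { double r; double i; };

// memcpy(dest, src, n): copies the pointer value held in `ap` into `Cap`.
// Both then point at the same buffer; nothing new is allocated, so there is
// nothing for the wrapper to delete[] afterwards.
static std::complex<double> *alias_as_complex(RcomplexLike *ap) {
    std::complex<double> *Cap = nullptr;
    static_assert(sizeof(Cap) == sizeof(ap), "both are plain data pointers");
    std::memcpy(&Cap, &ap, sizeof(ap));
    return Cap;
}

// Reversed arguments: `ap` is overwritten with whatever `Cap` held (a null
// pointer here), and the address of the caller's data is lost.
static RcomplexLike *reversed(RcomplexLike *ap) {
    std::complex<double> *Cap = nullptr;
    std::memcpy(&ap, &Cap, sizeof(ap));
    return ap;
}
```

Only a pointer is copied in either case, so the cost of the trick is one extra pointer per array, not a second copy of the data.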

Best regards,
Pierre


[R-pkg-devel] [Re] warning: type of ‘zhpevx_’ does not match original declaration [-Wlto-type-mismatch]

2020-12-17 Thread Pierre Lafaye de Micheaux
Dear Tomas,

Thank you very much for your feedback. This was really helpful and helped me to 
find a solution. Even if it is not entirely satisfactory to me (in the sense 
that I now create new extra pointers which will take unnecessary space in 
memory), at least the LTO errors are not displayed anymore with Gfortran 8.3.0. 
Hopefully the CRAN team will find it sufficient to allow me to put my package 
back on the CRAN.

I explain below what I did in case someone else faces the same problem.

At first, I could not reproduce these LTO warnings on my system (Debian 10 
Buster, with Gfortran 8.3.0). I therefore modified the file /etc/R/Makeconf by 
adding -flto after every -fpic I could find.

Second, I was confused by the warning message below:
myzhpevx.cpp:22:16: warning: type of ‘zhpevx_’ does not match original 
declaration [-Wlto-type-mismatch]
   void F77_NAME(zhpevx)(char *jobz, char *range, char *uplo,
Indeed, I thought that the problem was with the type of the function itself, 
not with the type of one of its arguments. Following your approach of creating 
a minimalist code (and playing with the arguments one by one), I convinced 
myself otherwise.
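
As a loose analogy in C++ (not a description of how the LTO check itself is implemented), the type of a function includes the types of its parameters, so changing one argument type changes "the type of the function". RcomplexLike and the two aliases below are hypothetical names used only for this sketch:

```cpp
#include <cassert>
#include <complex>
#include <type_traits>

// Stand-in for R's Rcomplex (assumption: two doubles, r and i).
struct RcomplexLike { double r; double i; };

// Two declarations that differ only in the type of one pointer argument.
using DeclWithStruct  = void(char *, RcomplexLike *);
using DeclWithComplex = void(char *, std::complex<double> *);

// The function types differ as soon as a single parameter type differs.
static bool declarations_match() {
    return std::is_same<DeclWithStruct, DeclWithComplex>::value;
}
```

This is consistent with the warning naming zhpevx_ as a whole even though only some of its arguments were declared differently.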

My new code is pasted below for convenience. I used this instruction:

double _Complex *Cap;
memcpy(&ap, &Cap, sizeof(ap));

and then in F77_NAME(zhpevx) I replaced
Rcomplex *ap
with
__complex__ double *Cap
and of course, I used Cap instead of ap in F77_CALL(zhpevx).

This trick removes the LTO warning (at the expense of three extra pointers; I 
had to do it for the three complex pointers ap, z and work).

Thank you once again for your precious help.

Kind regards,
Pierre L.

PS: I CC Professor B. Ripley since he was the one that originally contacted me 
about this problem, and in case this might trigger the need for a modification 
of something in R core or in its documentation (though probably this is just 
the result of my limited knowledge in Fortran and/or C).


#define USE_FC_LEN_T
#include 
#include "Rmath.h"


#ifdef FC_LEN_T

extern "C" {

  void zhpevxC(char **jobz, char **range, char **uplo, int *n, Rcomplex *ap,
  double *vl, double *vu, int *il, int *iu, double *abstol, int *m,
  double *w, Rcomplex *z, int *ldz, Rcomplex *work, double *rwork,
  int *iwork, int *ifail, int *info) {


char cjobz[2];
strncpy(cjobz, jobz[0], 1);
cjobz[1] = '\0';
char crange[2];
strncpy(crange, range[0], 1);
crange[1] = '\0';
char cuplo[2];
strncpy(cuplo, uplo[0], 1);
cuplo[1] = '\0';

double _Complex *Cap;
memcpy(&ap, &Cap, sizeof(ap));
double _Complex *Cz;
memcpy(&z, &Cz, sizeof(z));
double _Complex *Cwork;
memcpy(&work, &Cwork, sizeof(work));


void F77_NAME(zhpevx)(const char *jobz, const char *range, const char *uplo,
 const int *n, __complex__ double *Cap, const double *vl,
 const double *vu, const int *il, const int *iu,
 const double *abstol, int *m, double *w,
 __complex__ double *Cz, const int *ldz, __complex__ double *Cwork,
 double *rwork, int *iwork, int *ifail, int *info,
 FC_LEN_T jobz_len,  FC_LEN_T range_len,  FC_LEN_T uplo_len);



F77_CALL(zhpevx)(cjobz, crange, cuplo, &n[0], Cap, &vl[0], &vu[0], &il[0],
 &iu[0], &abstol[0], &m[0], w, Cz, &ldz[0], Cwork, rwork, iwork, ifail,
 &info[0], strlen(cjobz), strlen(crange), strlen(cuplo));


delete[] Cap;
delete[] Cz;
delete[] Cwork;

  }

}
#else
extern "C" {

  void zhpevxC(char **jobz, char **range, char **uplo, int *n, Rcomplex *ap,
  double *vl, double *vu, int *il, int *iu, double *abstol, int *m,
  double *w, Rcomplex *z, int *ldz, Rcomplex *work, double *rwork,
  int *iwork, int *ifail, int *info) {

extern void F77_NAME(zhpevx)(const char *jobz, const char *range,
const char *uplo,
const int *n, __complex__ double *Cap, const double *vl,
const double *vu, const int *il, const int *iu,
const double *abstol, int *m, double *w,
__complex__ double *Cz, const int *ldz, __complex__ double *Cwork,
double *rwork,
int *iwork, int *ifail, int *info);

const char *JOBZ = jobz[0];
const char *RANGE = range[0];
const char *UPLO = uplo[0];

double _Complex *Cap;
memcpy(&ap, &Cap, sizeof(ap));
double _Complex *Cz;
memcpy(&z, &Cz, sizeof(z));
double _Complex *Cwork;
memcpy(&work, &Cwork, sizeof(work));

F77_CALL(zhpevx)(JOBZ, RANGE, UPLO, &n[0], Cap, &vl[0], &vu[0], &il[0],
&iu[0], &abstol[0], &m[0],
w, Cz, &ldz[0], Cwork, rwork, iwork, ifail, &info[0]);

delete[] JOBZ;
delete[] RANGE;
delete[] UPLO;

delete[] Cap;
delete[] Cz;
delete[] Cwork;
  }

}
#endif
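
For readers unfamiliar with the extra FC_LEN_T arguments, here is a hedged sketch of the idea. fake_fortran_routine and wrapper are hypothetical stand-ins (no LAPACK or R headers are used), and std::size_t is assumed as the hidden-length type. With gfortran-style conventions a CHARACTER argument arrives as a pointer plus a hidden length appended to the argument list, and the callee does not look for a terminating '\0':

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a Fortran routine with a CHARACTER*1 JOBZ
// argument: it reads only `jobz_len` characters from `jobz`.
extern "C" int fake_fortran_routine(const char *jobz, const int *n,
                                    std::size_t jobz_len) {
    if (jobz_len < 1) return -1;
    return (jobz[0] == 'V') ? *n : 0;  // pretend "V" means "compute n vectors"
}

// Wrapper in the style of zhpevxC: character vectors arrive as char **.
extern "C" int wrapper(char **jobz, int *n) {
    // Borrow the first character of the caller's string and pass length 1:
    // no temporary copy is needed and nothing is deleted.
    return fake_fortran_routine(jobz[0], n, 1);
}
```

Under these assumptions the strncpy copies into cjobz, crange and cuplo are harmless but optional, since a length of 1 already tells the callee how much to read.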



From: Tomas Kalibera 
Sent: Tuesday, 15 December 2020 23:01
To: Pierre Lafaye de Micheaux 
Cc: r-package-devel@r-project.org 
Subject: Re: [R-pkg-devel] warning: type of ‘zhpevx_’ does not match original 
declaration [-Wlto-type-mismatch]

Dear Pierre,

your code checks fine on my Ubuntu 20.04 (gcc/gfortran 9.3), but I can 
r

Re: [R-pkg-devel] Used package not updated - needs java < V 11

2020-12-17 Thread Knut Krueger

Am 16.12.20 um 16:57 schrieb Duncan Murdoch:

> No, you should drop the @importFrom comment completely, and in your R
> code use those fully qualified forms.
>
> Duncan Murdoch
Sorry I did not read carefully

Knut



Re: [R-pkg-devel] Used package not updated - needs java < V 11

2020-12-17 Thread Knut Krueger





   Also, I received an email from CRAN maintainers months ago saying 
that "gdata" was being obsoleted.  It's still on CRAN with a date of 
2017-06-06 and a huge number of reverse dependencies.  The CRAN 
maintainers may have gotten someone to agree to take it over who just 
hasn't finished fixing whatever deficiencies it has.  However, you might 
see how difficult it might be to do without "gdata" as well.



   Spencer Graves



   Is it permissible to copy the code from rename.vars (gdata) into 
my package with a note such as:


"  Function rename.vars Source code from gdata as gdata is unmaintained 
since 2017-06-06 Rename variables in a dataframe

Author(s)
Don MacQueen (package gdata), macq\@llnl.gov."

it is the only function I am using from gdata.


Knut



Re: [R-pkg-devel] Used package not updated - needs java < V 11

2020-12-17 Thread Spencer Graves




On 2020-12-17 12:22, Knut Krueger wrote:




   Also, I received an email from CRAN maintainers months ago 
saying that "gdata" was being obsoleted.  It's still on CRAN with a 
date of 2017-06-06 and a huge number of reverse dependencies.  The 
CRAN maintainers may have gotten someone to agree to take it over who 
just hasn't finished fixing whatever deficiencies it has.  However, 
you might see how difficult it might be to do without "gdata" as well.



   Spencer Graves



    Is it permissible to copy the code from rename.vars (gdata) into 
my package with a note such as:


"  Function rename.vars Source code from gdata as gdata is unmaintained 
since 2017-06-06 Rename variables in a dataframe

Author(s)
Don MacQueen (package gdata), macq\@llnl.gov."

it is the only function I am using from gdata.



	  That's definitely consistent with the GPL-2 license it carries as 
long as you aren't trying to charge royalties for use of your package. 
See:



https://CRAN.R-project.org/package=gdata


	  And the tarball is available at this link to make it easy for you to 
do that.



	  Regarding my earlier comment about gdata being potentially 
obsoleted, I don't see evidence of that now:  There is only one "Note" 
in the "CRAN checks", and that would seem to be a problem more with that 
platform than with the gdata package.  The problem that Brian Ripley 
mentioned in his email about this 2020-09-11 may have been with 
something else that gdata used that has since been fixed.



	  I would normally prefer to let the gdata maintainers continue to 
maintain a function like this.  However, you seem to have a compelling 
reason for copying that function and only citing the original for the 
source of where you got the code.  You might include, e.g., 
\code{\link[gdata]{rename.vars}}.



	  CAVEAT:  Please ignore the above if contradicted by someone more 
knowledgeable than I am about CRAN and R policies and recommendations.



  Spencer



Knut



Re: [R-pkg-devel] Package Encoding and Literal Strings

2020-12-17 Thread jo...@jorisgoosen.nl
On Thu, 17 Dec 2020 at 18:22, Tomas Kalibera 
wrote:

> On 12/17/20 5:17 PM, jo...@jorisgoosen.nl wrote:
>
>
>
> On Thu, 17 Dec 2020 at 10:46, Tomas Kalibera 
> wrote:
>
>> On 12/16/20 11:07 PM, jo...@jorisgoosen.nl wrote:
>> > David,
>> >
>> > Thanks for the response!
>> >
>> > So the problem is a bit worse than just setting `encoding="UTF-8"` on
>> > functions like readLines.
>> > I'll describe our setup a bit:
>> > So we run R embedded in a separate executable and through a whole bunch
>> > of C(++) magic get that to the main executable that runs the actual
>> > interface.
>> > All the code that isn't R basically uses UTF-8. This works well and
>> > we've made sure that all of our source code is encoded properly and I've
>> > verified that for this particular problem at least my source file is
>> > definitely encoded in UTF-8 (I've checked a hexdump).
>> >
>> > The simplest solution, that we initially took, to get R+Windows to
>> > cooperate with everything is to simply set the locale to "C" before
>> > starting R. That way R simply assumes UTF-8 is native and everything
>> > worked splendidly. Until of course a file needs to be opened in R that
>> > contains some non-ASCII characters. I noticed the problem because a
>> > Korean user had hangul in his username and that broke everything. This
>> > because R was trying to convert to a different locale than Windows was
>> > using.
>>
>> Setting locale to "C" does not make R assume UTF-8 is the native
>> encoding, there is no way to make UTF-8 the current native encoding in R
>> on the current builds of R on Windows. This is an old limitation of
>> Windows, only recently fixed by Microsoft in recent Windows 10 and with
>> UCRT Windows runtime (see my blog post [1] for more - to make R support
>> this we need a new toolchain to build R).
>>
>> If you set the locale to C encoding, you are telling R the native
>> encoding is C/POSIX (essentially ASCII), not UTF-8. Encoding-sensitive
>> operations, including conversions, including those conversions that
>> happen without user control e.g. for interacting with Windows, will
>> produce incorrect results (garbage) or in better case errors, warnings,
>> omitted, substituted or transliterated characters.
>>
>> In principle setting the encoding via locale is dangerous on Windows,
>> because Windows has two current encodings, not just one. By setting
>> locale you set the one used in the C runtime, but not the other one used
>> by the system calls. If all code (in R, packages, external libraries)
>> was perfect, this would still work as long as all strings used were
>> representable in both encodings. For other strings it won't work, and
>> then code is not perfect in this regard, it is usually written assuming
>> there is one current encoding, which common sense dictates should be the
>> case. With the recent UTF-8 support ([1]), one can switch both of these
>> to UTF-8.
>>
>
> Well, this is exactly why I want to get rid of the situation. But this
> messes up the output because everything else expects UTF-8 which is why I'm
> looking for some kind of solution.
>
>
>
>> > The solution I've now been working on is:
>> > I took the sourcecode of R 4.0.3 and changed the backend of "gettext" to
>> > add an `encoding="something something"` option. And a bit of extra stuff
>> > like `bind_textdomain_codeset` in case I need to tweak the
>> > codeset/charset that gettext uses.
>> > I think I've got that working properly now and once I solve the problem
>> > of the encoding in a pkg I will open a bugreport/feature-request and
>> > I'll add a patch that implements it.
>>
>> A number of similar "shortcuts" have been added to R in the past, but
>> they make the code more complex, harder to maintain and use, and can't
>> realistically solve all of these problems, anyway. Strings will
>> eventually be assumed to be in what is the current native encoding by
>> the C library, in R, in any external code R uses, or in code R packages
>> use. Now that Microsoft finally is supporting UTF-8, the way to get out of
>> this is switching to UTF-8. This needs only small changes to R source
>> code compared to those "shortcuts" (or to using UTF-16LE). I'd be
>> against polluting the code with any more "shortcuts".
>>
>
> I think the addition of " bind_textdomain_codeset" is not strictly
> necessary and can be left out. Because I think setting an environment
> variable as "OUTPUT_CHARSET=UTF-8" gives the same result for us.
> The addition of the "encoding" option to the internal "do_gettext" is just
> a few lines of code and I also undid some duplication between do_gettext
> and do_ngettext. Which should make it easier to maintain. But all of that
> is moot if there is no way to keep the literal strings from sources in
> UTF-8 anyhow.
>
> Before starting on this I did actually read your blogpost about UTF-8
> several times and it seems like the best way forward. Not to mention it
> would make my life easier and me happier when I can s
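
To illustrate why a C/POSIX (essentially ASCII) native encoding cannot represent the Hangul user name mentioned in this thread, here is a small sketch. is_ascii and hangul_han are hypothetical names introduced for this example, and the string is spelled as explicit bytes so the example does not depend on the source file's encoding:

```cpp
#include <cassert>
#include <string>

// True if every byte lies in the 7-bit ASCII range (0x00..0x7F).
static bool is_ascii(const std::string &bytes) {
    for (unsigned char c : bytes)
        if (c > 0x7F) return false;
    return true;
}

// UTF-8 encoding of U+D55C, a Hangul syllable: three bytes, all above 0x7F.
// These are the bytes a hexdump of a UTF-8 source file would show for it.
static const std::string hangul_han = "\xED\x95\x9C";
```

Any conversion that treats such bytes as ASCII has to fail, substitute, or produce garbage, which matches the behaviour described above.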

Re: [R-pkg-devel] Package Encoding and Literal Strings

2020-12-17 Thread Tomas Kalibera
On 12/17/20 5:17 PM, jo...@jorisgoosen.nl wrote:
>
>
> On Thu, 17 Dec 2020 at 10:46, Tomas Kalibera  > wrote:
>
> On 12/16/20 11:07 PM, jo...@jorisgoosen.nl
>  wrote:
> > David,
> >
> > Thanks for the response!
> >
> > So the problem is a bit worse than just setting `encoding="UTF-8"` on
> > functions like readLines.
> > I'll describe our setup a bit:
> > So we run R embedded in a separate executable and through a whole
> > bunch of C(++) magic get that to the main executable that runs the
> > actual interface.
> > All the code that isn't R basically uses UTF-8. This works well and
> > we've made sure that all of our source code is encoded properly and
> > I've verified that for this particular problem at least my source file
> > is definitely encoded in UTF-8 (I've checked a hexdump).
> >
> > The simplest solution, that we initially took, to get R+Windows to
> > cooperate with everything is to simply set the locale to "C" before
> > starting R. That way R simply assumes UTF-8 is native and everything
> > worked splendidly. Until of course a file needs to be opened in R that
> > contains some non-ASCII characters. I noticed the problem because a
> > Korean user had hangul in his username and that broke everything. This
> > because R was trying to convert to a different locale than Windows was
> > using.
>
> Setting locale to "C" does not make R assume UTF-8 is the native
> encoding, there is no way to make UTF-8 the current native encoding in R
> on the current builds of R on Windows. This is an old limitation of
> Windows, only recently fixed by Microsoft in recent Windows 10 and with
> UCRT Windows runtime (see my blog post [1] for more - to make R support
> this we need a new toolchain to build R).
>
> If you set the locale to C encoding, you are telling R the native
> encoding is C/POSIX (essentially ASCII), not UTF-8. Encoding-sensitive
> operations, including conversions, including those conversions that
> happen without user control e.g. for interacting with Windows, will
> produce incorrect results (garbage) or in better case errors, warnings,
> omitted, substituted or transliterated characters.
>
> In principle setting the encoding via locale is dangerous on Windows,
> because Windows has two current encodings, not just one. By setting
> locale you set the one used in the C runtime, but not the other one used
> by the system calls. If all code (in R, packages, external libraries)
> was perfect, this would still work as long as all strings used were
> representable in both encodings. For other strings it won't work, and
> then code is not perfect in this regard, it is usually written assuming
> there is one current encoding, which common sense dictates should be the
> case. With the recent UTF-8 support ([1]), one can switch both of these
> to UTF-8.
>
>
> Well, this is exactly why I want to get rid of the situation. But this 
> messes up the output because everything else expects UTF-8 which is 
> why I'm looking for some kind of solution.
>
> > The solution I've now been working on is:
> > I took the sourcecode of R 4.0.3 and changed the backend of "gettext"
> > to add an `encoding="something something"` option. And a bit of extra
> > stuff like `bind_textdomain_codeset` in case I need to tweak the
> > codeset/charset that gettext uses.
> > I think I've got that working properly now and once I solve the
> > problem of the encoding in a pkg I will open a
> > bugreport/feature-request and I'll add a patch that implements it.
>
> A number of similar "shortcuts" have been added to R in the past, but
> they make the code more complex, harder to maintain and use, and can't
> realistically solve all of these problems, anyway. Strings will
> eventually be assumed to be in what is the current native encoding by
> the C library, in R, in any external code R uses, or in code R packages
> use. Now that Microsoft finally is supporting UTF-8, the way to get out of
> this is switching to UTF-8. This needs only small changes to R source
> code compared to those "shortcuts" (or to using UTF-16LE). I'd be
> against polluting the code with any more "shortcuts".
>
>
> I think the addition of " bind_textdomain_codeset" is not strictly 
> necessary and can be left out. Because I think setting an environment 
> variable as "OUTPUT_CHARSET=UTF-8" gives the same result for us.
> The addition of the "encoding" option to the internal "do_gettext" is 
> just a few lines of code and I also undid some duplication between 
> do_gettext and do_ngettext. Which should make it easier to maintain

Re: [R-pkg-devel] Package Encoding and Literal Strings

2020-12-17 Thread jo...@jorisgoosen.nl
Ps. I will try to have a go at using your experimental version to see if
that could help us out. If I run into trouble I will mail you personally.


Re: [R-pkg-devel] unexpected CRAN pretest failure

2020-12-17 Thread Joshua Ulrich
On Thu, Dec 17, 2020 at 7:25 AM Rossum, Bart-Jan van
 wrote:
>
> Dear Uwe,
>
> Thanks for your reaction.
> I installed the latest available R-devel version (2020-12-15 r79633) on my
> own Windows PC, but even then the checks pass cleanly.
> So, unfortunately I'm not able to reproduce the issue.
>
> To be sure, I also retried on Winbuilder, which has a slightly different 
> version (r79643), but there the issue is still present.
>
> Could there be something else I need to do to be able to reproduce this 
> locally?
>
I assume you used the pre-compiled version here:
https://cran.r-project.org/bin/windows/base/rdevel.html

That version probably lags behind the Subversion repo by about a day.
To get the very latest R-devel, you need to build it from the current
source in the Subversion repo. Rocker and r-hub make this much easier
than setting everything up locally on your machine.

That said, it looks like the pre-compiled version is at r79643 now, so
you should be able to use it to reproduce the issue.

Best,
Josh


> Regards,
> Bart-Jan
>
> -----Original Message-----
> From: Uwe Ligges 
> Sent: Thursday, December 17, 2020 13:29
> To: Rossum, Bart-Jan van ; 
> r-package-devel@r-project.org
> Subject: Re: [R-pkg-devel] unexpected CRAN pretest failure
>
> This is from a change in R-devel.
> Use a more recent R-devel to reproduce the issue.
>
> Best,
> Uwe Ligges
>
> On 14.12.2020 15:02, Rossum, Bart-Jan van wrote:
> > Dear community,
> >
> > When trying to update my CRAN package
> > (https://cran.r-project.org/web/packages/statgenGxE/index.html)
> > I ran into an unexpected error on Windows.
> > I had tested before submission on R-hub, which went fine, but CRAN 
> > complained, and this was confirmed on Winbuilder.
> > I noticed a slight difference in R-version used on CRAN/Winbuilder and 
> > R-hub.
> > However, the error itself seems to come from an lme4 function.
> > I'm quite clueless on how to debug/fix this.
> >
> > CRAN and Winbuilder:
> > R Under development (unstable) (2020-12-13 r79623)
> > https://win-builder.r-project.org/nK0OMOQ378SI
> >
> > R-hub:
> > R Under development (unstable) (2020-11-30 r79529)
> > https://builder.r-hub.io/status/statgenGxE_1.0.4.tar.gz-fcc1e205a5fb4fd09559d301ee3502c9
> >
> > Any pointers are appreciated,
> > Bart-Jan
> >
> > __
> > R-package-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-package-devel
> >
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel



--
Joshua Ulrich  |  about.me/joshuaulrich
FOSS Trading  |  www.fosstrading.com

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Package Encoding and Literal Strings

2020-12-17 Thread jo...@jorisgoosen.nl
On Thu, 17 Dec 2020 at 10:46, Tomas Kalibera 
wrote:

> On 12/16/20 11:07 PM, jo...@jorisgoosen.nl wrote:
> > David,
> >
> > Thanks for the response!
> >
> > So the problem is a bit worse than just setting `encoding="UTF-8"` on
> > functions like readLines.
> > I'll describe our setup a bit:
> > So we run R embedded in a separate executable and through a whole bunch
> of
> > C(++) magic get that to the main executable that runs the actual
> interface.
> > All the code that isn't R basically uses UTF-8. This works well and we've
> > made sure that all of our source code is encoded properly and I've verified
> > that for this particular problem at least my source file is definitely
> > encoded in UTF-8 (I've checked a hexdump).
> >
> > The simplest solution, that we initially took, to get R+Windows to
> > cooperate with everything is to simply set the locale to "C" before
> > starting R. That way R simply assumes UTF-8 is native and everything
> worked
> > splendidly. Until of course a file needs to be opened in R that contains
> > some non-ASCII characters. I noticed the problem because a Korean user had
> > Hangul in his username and that broke everything. This was because R was
> > trying to convert to a different locale than Windows was using.
>
> Setting locale to "C" does not make R assume UTF-8 is the native
> encoding, there is no way to make UTF-8 the current native encoding in R
> on the current builds of R on Windows. This is an old limitation of
> Windows, only recently fixed by Microsoft in recent Windows 10 and with
> UCRT Windows runtime (see my blog post [1] for more - to make R support
> this we need a new toolchain to build R).
>
> If you set the locale to C encoding, you are telling R the native
> encoding is C/POSIX (essentially ASCII), not UTF-8. Encoding-sensitive
> operations, including conversions, including those conversions that
> happen without user control e.g. for interacting with Windows, will
> produce incorrect results (garbage) or in better case errors, warnings,
> omitted, substituted or transliterated characters.
>
> In principle setting the encoding via locale is dangerous on Windows,
> because Windows has two current encodings, not just one. By setting
> locale you set the one used in the C runtime, but not the other one used
> by the system calls. If all code (in R, packages, external libraries)
> was perfect, this would still work as long as all strings used were
> representable in both encodings. For other strings it won't work, and
> then code is not perfect in this regard, it is usually written assuming
> there is one current encoding, which common sense dictates should be the
> case. With the recent UTF-8 support ([1]), one can switch both of these
> to UTF-8.
>

Well, this is exactly why I want to get rid of the situation. But this
messes up the output because everything else expects UTF-8 which is why I'm
looking for some kind of solution.



> > The solution I've now been working on is:
> > I took the sourcecode of R 4.0.3 and changed the backend of "gettext" to
> > add an `encoding="something something"` option. And a bit of extra stuff
> > like `bind_textdomain_codeset` in case I need to tweak the
> codeset/charset
> > that gettext uses.
> > I think I've got that working properly now and once I solve the problem
> of
> > the encoding in a pkg I will open a bugreport/feature-request and I'll
> add
> > a patch that implements it.
>
> A number of similar "shortcuts" have been added to R in the past, but
> they make the code more complex, harder to maintain and use, and can't
> realistically solve all of these problems, anyway. Strings will
> eventually be assumed to be in what is the current native encoding by
> the C library, whether in R, in any external code R uses, or in code R
> packages use.
> Now that Microsoft finally is supporting UTF-8, the way to get out of
> this is switching to UTF-8. This needs only small changes to R source
> code compared to those "shortcuts" (or to using UTF-16LE). I'd be
> against polluting the code with any more "shortcuts".
>

I think the addition of " bind_textdomain_codeset" is not strictly
necessary and can be left out. Because I think setting an environment
variable as "OUTPUT_CHARSET=UTF-8" gives the same result for us.
The addition of the "encoding" option to the internal "do_gettext" is just
a few lines of code and I also undid some duplication between do_gettext
and do_ngettext. Which should make it easier to maintain. But all of that
is moot if there is no way to keep the literal strings from sources in
UTF-8 anyhow.

Before starting on this I did actually read your blogpost about UTF-8
several times and it seems like the best way forward. Not to mention it
would make my life easier and me happier when I can stop worrying about
Windows/Dos codepages!
Thank you for your work on it indeed!

But my problem with that is that a number of people still use an older
version of windows and your solution won't work there. Which would mean
that 

Re: [R-pkg-devel] unexpected CRAN pretest failure

2020-12-17 Thread Rossum, Bart-Jan van
Dear Uwe,

Thanks for your reaction.
I installed the latest available R-devel version (2020-12-15 r79633) on my own
Windows PC, but even then the checks pass cleanly.
So, unfortunately I'm not able to reproduce the issue.

To be sure, I also retried on Winbuilder, which has a slightly different 
version (r79643), but there the issue is still present. 

Could there be something else I need to do to be able to reproduce this locally?

Regards,
Bart-Jan

-----Original Message-----
From: Uwe Ligges  
Sent: Thursday, December 17, 2020 13:29
To: Rossum, Bart-Jan van ; 
r-package-devel@r-project.org
Subject: Re: [R-pkg-devel] unexpected CRAN pretest failure

This is from a change in R-devel.
Use a more recent R-devel to reproduce the issue.

Best,
Uwe Ligges

On 14.12.2020 15:02, Rossum, Bart-Jan van wrote:
> Dear community,
> 
> When trying to update my CRAN package
> (https://cran.r-project.org/web/packages/statgenGxE/index.html)
> I ran into an unexpected error on Windows.
> I had tested before submission on R-hub, which went fine, but CRAN 
> complained, and this was confirmed on Winbuilder.
> I noticed a slight difference in R-version used on CRAN/Winbuilder and R-hub.
> However, the error itself seems to come from an lme4 function.
> I'm quite clueless on how to debug/fix this.
> 
> CRAN and Winbuilder:
> R Under development (unstable) (2020-12-13 r79623)
> https://win-builder.r-project.org/nK0OMOQ378SI
> 
> R-hub:
> R Under development (unstable) (2020-11-30 r79529)
> https://builder.r-hub.io/status/statgenGxE_1.0.4.tar.gz-fcc1e205a5fb4fd09559d301ee3502c9
> 
> Any pointers are appreciated,
> Bart-Jan
> 
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
> 

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] unexpected CRAN pretest failure

2020-12-17 Thread Uwe Ligges

This is from a change in R-devel.
Use a more recent R-devel to reproduce the issue.

Best,
Uwe Ligges

On 14.12.2020 15:02, Rossum, Bart-Jan van wrote:

Dear community,

When trying to update my CRAN package
(https://cran.r-project.org/web/packages/statgenGxE/index.html) I ran
into an unexpected error on Windows.
I had tested before submission on R-hub, which went fine, but CRAN complained, 
and this was confirmed on Winbuilder.
I noticed a slight difference in R-version used on CRAN/Winbuilder and R-hub.
However, the error itself seems to come from an lme4 function.
I'm quite clueless on how to debug/fix this.

CRAN and Winbuilder:
R Under development (unstable) (2020-12-13 r79623)
https://win-builder.r-project.org/nK0OMOQ378SI

R-hub:
R Under development (unstable) (2020-11-30 r79529)
https://builder.r-hub.io/status/statgenGxE_1.0.4.tar.gz-fcc1e205a5fb4fd09559d301ee3502c9

Any pointers are appreciated,
Bart-Jan

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel





Re: [R-pkg-devel] R CMD check error on Windows

2020-12-17 Thread Akshit Achara
Dear Uwe,

Thank you very much for the explanation.

Regards,
Akshit Achara



On Thu, Dec 17, 2020 at 3:32 PM Uwe Ligges 
wrote:

>
>
> On 17.12.2020 10:39, Akshit Achara wrote:
> > Dear Uwe,
> >
> > Can you please elaborate?
> >
> > I generated the configure script from configure.ac on Linux
> > and copied its contents to configure.win (changed #! /bin/sh
> > to #! /bin/bash). Does it cause any quoting issues?
>
> Well, R >= 4.0.0 uses a different toolchain on Windows, so likely the
> reason is the change in the toolchain.
>
> I would not worry too much, as it works for the new toolchain and from R
> >= 4.0.0.
>
> Best,
> Uwe Ligges
>
>
>
> > Thanks,
> > Akshit Achara
> >
> >
> >
> > On Thu, Dec 17, 2020 at 3:01 PM Uwe Ligges
> >  > > wrote:
> >
> >
> >
> > On 17.12.2020 10:26, Akshit Achara wrote:
> >  > Dear Sir,
> >  >
> >  > I got this error on rminizinc CRAN checks for
> >  > r-oldrel-windows-ix86+x86_64:
> >  > (The package needs to run ./configure.win during installation)
> >  >
> >  > exec: /cygdrive/c/Program: not found
> >
> > Sounds like quoting issues as certainly /cygdrive/c/Program Files is
> > meant here.
> >
> > Best,
> > Uwe Ligges
> >
> >  >
> >  > Warning in system("sh ./configure.win") : Exit code was 127
> >  >
> >  > ERROR: configuration failed for package 'rminizinc'
> >  >
> >  >
> >  > I know that the error occurred because sh command didn't work. I am
> >  > not getting any errors for other Windows flavors (devel and release).
> >  >
> >  > I wanted to ask if there is any solution for this error that can be
> >  > implemented from my end or should I add anything in the
> >  > SystemRequirements and resubmit to CRAN.
> >  >
> >  > Thanks,
> >  > Akshit Achara
> >  >
> >  >   [[alternative HTML version deleted]]
> >  >
> >  > __
> >  > R-package-devel@r-project.org mailing list
> >  > https://stat.ethz.ch/mailman/listinfo/r-package-devel
> >  >
> >
>

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] R CMD check error on Windows

2020-12-17 Thread Uwe Ligges




On 17.12.2020 10:39, Akshit Achara wrote:

Dear Uwe,

Can you please elaborate?

I generated the configure script from configure.ac on Linux
and copied its contents to configure.win (changed #! /bin/sh
to #! /bin/bash). Does it cause any quoting issues?


Well, R >= 4.0.0 uses a different toolchain on Windows, so likely the
reason is the change in the toolchain.


I would not worry too much, as it works for the new toolchain and from R
>= 4.0.0.


Best,
Uwe Ligges




Thanks,
Akshit Achara



On Thu, Dec 17, 2020 at 3:01 PM Uwe Ligges 
> wrote:




On 17.12.2020 10:26, Akshit Achara wrote:
 > Dear Sir,
 >
 > I got this error on rminizinc CRAN checks for
 > r-oldrel-windows-ix86+x86_64:
 > (The package needs to run ./configure.win during installation)
 >
 > exec: /cygdrive/c/Program: not found

Sounds like quoting issues as certainly /cygdrive/c/Program Files is
meant here.

Best,
Uwe Ligges

 >
 > Warning in system("sh ./configure.win") : Exit code was 127
 >
 > ERROR: configuration failed for package 'rminizinc'
 >
 >
 > I know that the error occurred because sh command didn't work. I am
 > not getting any errors for other Windows flavors (devel and release).
 >
 > I wanted to ask if there is any solution for this error that can be
 > implemented from my end or should I add anything in the
 > SystemRequirements and resubmit to CRAN.
 >
 > Thanks,
 > Akshit Achara
 >
 >       [[alternative HTML version deleted]]
 >
 > __
 > R-package-devel@r-project.org mailing list
 > https://stat.ethz.ch/mailman/listinfo/r-package-devel
 >



__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Package Encoding and Literal Strings

2020-12-17 Thread Tomas Kalibera

On 12/16/20 11:07 PM, jo...@jorisgoosen.nl wrote:

David,

Thanks for the response!

So the problem is a bit worse than just setting `encoding="UTF-8"` on
functions like readLines.
I'll describe our setup a bit:
So we run R embedded in a separate executable and through a whole bunch of
C(++) magic get that to the main executable that runs the actual interface.
All the code that isn't R basically uses UTF-8. This works well and we've
made sure that all of our source code is encoded properly and I've verified
that for this particular problem at least my source file is definitely
encoded in UTF-8 (I've checked a hexdump).

The simplest solution, that we initially took, to get R+Windows to
cooperate with everything is to simply set the locale to "C" before
starting R. That way R simply assumes UTF-8 is native and everything worked
splendidly. Until of course a file needs to be opened in R that contains
some non-ASCII characters. I noticed the problem because a Korean user had
Hangul in his username and that broke everything. This was because R was
trying to convert to a different locale than Windows was using.


Setting locale to "C" does not make R assume UTF-8 is the native 
encoding, there is no way to make UTF-8 the current native encoding in R 
on the current builds of R on Windows. This is an old limitation of 
Windows, only recently fixed by Microsoft in recent Windows 10 and with 
UCRT Windows runtime (see my blog post [1] for more - to make R support 
this we need a new toolchain to build R).


If you set the locale to C encoding, you are telling R the native 
encoding is C/POSIX (essentially ASCII), not UTF-8. Encoding-sensitive 
operations, including conversions, including those conversions that 
happen without user control e.g. for interacting with Windows, will 
produce incorrect results (garbage) or in better case errors, warnings, 
omitted, substituted or transliterated characters.
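
To make the failure modes above concrete, here is a minimal sketch in Python (not R). It uses Python's "ascii" codec as a stand-in for a C/POSIX locale, which is an assumption made purely for illustration; R's own conversions go through different code paths. The helper name encode_ascii is hypothetical.

```python
# Illustration only: the "ascii" codec stands in for a C/POSIX native encoding.
text = "Mathôt!"


def encode_ascii(s, errors):
    """Encode s as ASCII with the given error policy.

    Returns the encoded bytes, or the name of the exception if encoding fails.
    """
    try:
        return s.encode("ascii", errors=errors)
    except UnicodeEncodeError as exc:
        return type(exc).__name__


strict = encode_ascii(text, "strict")        # the conversion fails with an error
omitted = encode_ascii(text, "ignore")       # the character is silently dropped
substituted = encode_ascii(text, "replace")  # the character becomes "?"

print(strict, omitted, substituted)
```

The three policies correspond to the "errors", "omitted" and "substituted" outcomes listed above; none of them preserves the original string.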


In principle setting the encoding via locale is dangerous on Windows, 
because Windows has two current encodings, not just one. By setting 
locale you set the one used in the C runtime, but not the other one used 
by the system calls. If all code (in R, packages, external libraries) 
was perfect, this would still work as long as all strings used were 
representable in both encodings. For other strings it won't work, and 
then code is not perfect in this regard, it is usually written assuming 
there is one current encoding, which common sense dictates should be the 
case. With the recent UTF-8 support ([1]), one can switch both of these 
to UTF-8.
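
As a rough sketch of the "representable in both encodings" condition, the following Python snippet checks a string against two code pages. The pairing of cp1252 and cp949 is a hypothetical example chosen for illustration, not the configuration discussed in this thread, and the function names are made up.

```python
def representable(s, encoding):
    """Return True if every character of s can be encoded in `encoding`."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical pair of "current" encodings, for illustration only.
C_RUNTIME_ENCODING = "cp1252"  # assumed: the one selected via the locale
SYSTEM_ENCODING = "cp949"      # assumed: the one used by system calls


def safe_in_both(s):
    """A string survives only if both encodings can represent it."""
    return representable(s, C_RUNTIME_ENCODING) and representable(s, SYSTEM_ENCODING)


print(safe_in_both("username"))  # plain ASCII text
print(safe_in_both("한글"))       # Hangul text
```

Plain ASCII passes in both encodings, while Hangul is representable in cp949 but not in cp1252, so it fails the combined check.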



The solution I've now been working on is:
I took the sourcecode of R 4.0.3 and changed the backend of "gettext" to
add an `encoding="something something"` option. And a bit of extra stuff
like `bind_textdomain_codeset` in case I need to tweak the codeset/charset
that gettext uses.
I think I've got that working properly now and once I solve the problem of
the encoding in a pkg I will open a bugreport/feature-request and I'll add
a patch that implements it.


A number of similar "shortcuts" have been added to R in the past, but
they make the code more complex, harder to maintain and use, and can't
realistically solve all of these problems, anyway. Strings will
eventually be assumed to be in what is the current native encoding by
the C library, whether in R, in any external code R uses, or in code R
packages use.
Now that Microsoft finally is supporting UTF-8, the way to get out of 
this is switching to UTF-8. This needs only small changes to R source 
code compared to those "shortcuts" (or to using UTF-16LE). I'd be 
against polluting the code with any more "shortcuts".



The problem I'm stuck with now is simply this:
I have an R pkg here that I want to test the translations with and the code
is definitely saved as UTF-8, the package has "Encoding: UTF-8" in the
DESCRIPTION and it all loads and works. The particular problem I have is
that the R code contains literally: `mathotString <- "Mathôt!"`
The actual file contains the hexadecimal representation of ô as proper
utf-8: "0xC3 0xB4" but R turns it into: "0xf4".
Seemingly on loading the package, because I haven't done anything with it
except put it in my debug c-function to print its contents as
hexadecimals...
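
The byte values mentioned above can be checked with a small Python sketch (Python rather than R, purely for illustration). Note that 0xF4 is both the Latin-1/Windows-1252 byte for ô and its Unicode code point U+00F4, so the observation alone does not tell which of the two R stored.

```python
char = "ô"  # U+00F4, LATIN SMALL LETTER O WITH CIRCUMFLEX

utf8_bytes = char.encode("utf-8")      # two-byte UTF-8 sequence
latin1_bytes = char.encode("latin-1")  # single byte, numerically equal to the code point

utf8_hex = " ".join(f"0x{b:02X}" for b in utf8_bytes)
latin1_hex = " ".join(f"0x{b:02X}" for b in latin1_bytes)
code_point = ord(char)

print(utf8_hex, "|", latin1_hex, "|", hex(code_point))

# Misreading the UTF-8 bytes as Latin-1 yields two characters instead of one.
misread = utf8_bytes.decode("latin-1")
print(len(misread))
```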

The only thing I want to achieve here is that when R loads the package it
keeps those strings in their original UTF-8 encoding, without converting it
to "native" or the strange unicode codepoint it seemingly placed in there
instead. Because otherwise I cannot get gettext to work fully in UTF-8 mode.

Is this already possible in R?


In principle, working with strings not representable in the current 
encoding is not reliable (and never will be). It can still work in some 
specific cases and uses. Parsing a UTF-8 string literal from a file, 
with correctly declared encoding as documented in WRE, should work at 
least in single-byte encodings. But what happens after that string is 
parsed is another thing. The parsing is based i

Re: [R-pkg-devel] R CMD check error on Windows

2020-12-17 Thread Akshit Achara
Dear Uwe,

Can you please elaborate?

I generated the configure script from configure.ac on Linux
and copied its contents to configure.win (changed #! /bin/sh
to #! /bin/bash). Does it cause any quoting issues?

Thanks,
Akshit Achara



On Thu, Dec 17, 2020 at 3:01 PM Uwe Ligges 
wrote:

>
>
> On 17.12.2020 10:26, Akshit Achara wrote:
> > Dear Sir,
> >
> > I got this error on rminizinc CRAN checks for
> > r-oldrel-windows-ix86+x86_64:
> > (The package needs to run ./configure.win during installation)
> >
> > exec: /cygdrive/c/Program: not found
>
> Sounds like quoting issues as certainly /cygdrive/c/Program Files is
> meant here.
>
> Best,
> Uwe Ligges
>
> >
> > Warning in system("sh ./configure.win") : Exit code was 127
> >
> > ERROR: configuration failed for package 'rminizinc'
> >
> >
> > I know that the error occurred because sh command didn't work. I am
> > not getting any errors for other Windows flavors (devel and release).
> >
> > I wanted to ask if there is any solution for this error that can be
> > implemented from my end or should I add anything in the
> > SystemRequirements and resubmit to CRAN.
> >
> > Thanks,
> > Akshit Achara
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-package-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-package-devel
> >
>

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] R CMD check error on Windows

2020-12-17 Thread Uwe Ligges




On 17.12.2020 10:26, Akshit Achara wrote:

Dear Sir,

I got this error on rminizinc CRAN checks for
r-oldrel-windows-ix86+x86_64:
(The package needs to run ./configure.win during installation)

exec: /cygdrive/c/Program: not found


Sounds like quoting issues as certainly /cygdrive/c/Program Files is
meant here.
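
A minimal sketch of how such a quoting issue truncates a path at the first space, using Python's shlex module to mimic POSIX shell word splitting. This only imitates tokenization; it does not run the actual configure.win. Everything after "/cygdrive/c/Program Files" in the path is a hypothetical placeholder.

```python
import shlex

# Hypothetical path: only the "/cygdrive/c/Program Files" prefix comes from the thread.
tool = "/cygdrive/c/Program Files/example/bin/tool"

# Unquoted, as in:  exec $TOOL   (the shell splits the value at the space)
unquoted_words = shlex.split(f"exec {tool}")

# Quoted, as in:    exec "$TOOL"  (the value stays a single word)
quoted_words = shlex.split(f"exec {shlex.quote(tool)}")

print(unquoted_words)
print(quoted_words)
```

In the unquoted case the command word becomes /cygdrive/c/Program, which matches the "exec: /cygdrive/c/Program: not found" message; quoting keeps the path intact.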


Best,
Uwe Ligges



Warning in system("sh ./configure.win") : Exit code was 127

ERROR: configuration failed for package 'rminizinc'


I know that the error occurred because sh command didn't work. I am
not getting any errors for other Windows flavors (devel and release).

I wanted to ask if there is any solution for this error that can be
implemented from my end or should I add anything in the
SystemRequirements and resubmit to CRAN.

Thanks,
Akshit Achara

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel





[R-pkg-devel] R CMD check error on Windows

2020-12-17 Thread Akshit Achara
Dear Sir,

I got this error on rminizinc CRAN checks for
r-oldrel-windows-ix86+x86_64:
(The package needs to run ./configure.win during installation)

exec: /cygdrive/c/Program: not found

Warning in system("sh ./configure.win") : Exit code was 127

ERROR: configuration failed for package 'rminizinc'


I know that the error occurred because sh command didn't work. I am
not getting any errors for other Windows flavors (devel and release).

I wanted to ask if there is any solution for this error that can be
implemented from my end or should I add anything in the
SystemRequirements and resubmit to CRAN.

Thanks,
Akshit Achara

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel