Re: [Rd] Definition of uintptr_t in Rinterface.h

2017-01-01 Thread Laurent Gautier
My comment is about the definition of HAVE_UINTPTR_T in Rconfig.h. stdint.h
ships with (g)libc, so it is unlikely to change, appear, or disappear
(unless the kernel and part of the OS change), and therefore may not be a
realistic concern. Mixing compilers, on the other hand, is frequent, but
this does not do much to prevent it.

2017-01-01 19:42 GMT-05:00 Simon Urbanek :

>
> > On Jan 1, 2017, at 5:12 PM, Laurent Gautier  wrote:
> >
> >
> >
> > 2017-01-01 8:28 GMT-05:00 Prof Brian Ripley :
> > On 29/12/2016 15:55, Simon Urbanek wrote:
> > The problem is elsewhere - Rinterface.h guards the ultima-ratio fallback
> > with HAVE_UINTPTR_T but that config flag is not exported in Rconfig.h.
> > Should be now fixed in R-devel - please check if that works for you.
> >
> > Rconfig.h would be appropriate if Rinterface.h is being included from C
> > code using the same compiler as used for R.  But as Rinterface.h is
> > intended for use by alternative front ends there is no guarantee that they
> > use the same compiler (and some use C++).
> >
> > Isn't changing libc/glibc discouraged anyway (without also changing to a
> > matching kernel)? If so, is this a realistic concern compared to the
> > compiler version issues (mentioned by Dirk)? In that case, what about
> > simplifying the documentation and usage to "use the same compiler or
> > undefined behaviour may occur"?
> >
>
> Unfortunately people often mix up different compilers (note this has
> nothing to do with glibc or the kernel!) - mixing up C and C++ is very
> common. Also there are specialized compilers for some applications (MPI
> etc.). So, yes, it is a realistic concern that I've seen more often than
> you'd think.
>
>
> >
> > This was documented in the manual:
> >
> > 'Note that uintptr_t is a C99 type for which a substitute is defined in
> > R, so your code needs to define HAVE_UINTPTR_T appropriately.'
> >
> > AFAICS if you comply, there will not be a conflict.
> >
> > Also note that is only an issue if CSTACK_DEFNS is defined, not the
> > default and not mentioned here.
> >
> >
> >
> >
> > Thanks,
> > Simon
> >
> >
> >
> > On Dec 26, 2016, at 11:25 PM, Laurent Gautier wrote:
> > (...)
> >
> > Is this expected ? Shouldn't R rely on the definition in stdint.h
> >
> > But there need not be one in stdint.h, as the type is optional in
> > C99/C11/C++11 and likely not present in C++98.
> >
> > AFAIUI stdint.h is part of C99:
> > https://en.wikipedia.org/wiki/C_standard_library#Header_files
> >
> > While at it, it is not exactly like C99 is the latest thing in town.
> > Wouldn't relying on it give an opportunity to simplify the code base ?
> >
> >
> >
> >
> > Laurent
> >
> >
> >
> >
> > rather than define its own ?
> >
> >
> > (report for the issue:
> > https://bitbucket.org/rpy2/rpy2/issues/389/failed-to-compile-with-python-360-on-32
> > )
> >
> >
> > Laurent
> >
> >
> >
> > --
> > Brian D. Ripley,  rip...@stats.ox.ac.uk
> > Emeritus Professor of Applied Statistics, University of Oxford
> >
>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Definition of uintptr_t in Rinterface.h

2017-01-01 Thread Simon Urbanek

> On Jan 1, 2017, at 5:12 PM, Laurent Gautier  wrote:
> 
> 
> 
> 2017-01-01 8:28 GMT-05:00 Prof Brian Ripley :
> On 29/12/2016 15:55, Simon Urbanek wrote:
> The problem is elsewhere - Rinterface.h guards the ultima-ratio fallback with 
> HAVE_UINTPTR_T but that config flag is not exported in Rconfig.h. Should be 
> now fixed in R-devel - please check if that works for you.
> 
> Rconfig.h would be appropriate if Rinterface.h is being included from C code 
> using the same compiler as used for R.  But as Rinterface.h is intended for 
> use by alternative front ends there is no guarantee that they use the same 
> compiler (and some use C++).
> 
> Isn't changing libc/glibc discouraged anyway (without also changing to a
> matching kernel)? If so, is this a realistic concern compared to the
> compiler version issues (mentioned by Dirk)? In that case, what about
> simplifying the documentation and usage to "use the same compiler or
> undefined behaviour may occur"?
> 

Unfortunately people often mix up different compilers (note this has nothing to 
do with glibc or the kernel!) - mixing up C and C++ is very common. Also there 
are specialized compilers for some applications (MPI etc.). So, yes, it is a 
realistic concern that I've seen more often than you'd think.


> 
> This was documented in the manual:
> 
> 'Note that uintptr_t is a C99 type for which a substitute is defined in R, so 
> your code needs to define HAVE_UINTPTR_T appropriately.'
> 
> AFAICS if you comply, there will not be a conflict.
> 
> Also note that is only an issue if CSTACK_DEFNS is defined, not the default 
> and not mentioned here.
> 
> 
> 
> 
> Thanks,
> Simon
> 
> 
> 
> On Dec 26, 2016, at 11:25 PM, Laurent Gautier  wrote:
> (...)
> 
> Is this expected ? Shouldn't R rely on the definition in stdint.h
> 
> But there need not be one in stdint.h, as the type is optional in 
> C99/C11/C++11 and likely not present in C++98.
> 
> AFAIUI stdint.h is part of C99:
> https://en.wikipedia.org/wiki/C_standard_library#Header_files 
> 
> While at it, it is not exactly like C99 is the latest thing in town. Wouldn't 
> relying on it give an opportunity to simplify the code base ?
> 
> 
> 
> 
> Laurent
> 
> 
> 
> 
> rather than define its own ?
> 
> 
> (report for the issue:
> https://bitbucket.org/rpy2/rpy2/issues/389/failed-to-compile-with-python-360-on-32
> )
> 
> 
> Laurent
> 
> 
> 
> -- 
> Brian D. Ripley,  rip...@stats.ox.ac.uk
> Emeritus Professor of Applied Statistics, University of Oxford
> 



Re: [Rd] Definition of uintptr_t in Rinterface.h

2017-01-01 Thread Laurent Gautier
2017-01-01 8:28 GMT-05:00 Prof Brian Ripley :

> On 29/12/2016 15:55, Simon Urbanek wrote:
>
>> The problem is elsewhere - Rinterface.h guards the ultima-ratio fallback
>> with HAVE_UINTPTR_T but that config flag is not exported in Rconfig.h.
>> Should be now fixed in R-devel - please check if that works for you.
>>
>
> Rconfig.h would be appropriate if Rinterface.h is being included from C
> code using the same compiler as used for R.  But as Rinterface.h is
> intended for use by alternative front ends there is no guarantee that they
> use the same compiler (and some use C++).
>

Isn't changing libc/glibc discouraged anyway (without also changing to a
matching kernel)? If so, is this a realistic concern compared to the
compiler version issues (mentioned by Dirk)? In that case, what about
simplifying the documentation and usage to "use the same compiler or
undefined behaviour may occur"?


> This was documented in the manual:
>
> 'Note that uintptr_t is a C99 type for which a substitute is defined in R,
> so your code needs to define HAVE_UINTPTR_T appropriately.'
>
> AFAICS if you comply, there will not be a conflict.
>
> Also note that is only an issue if CSTACK_DEFNS is defined, not the
> default and not mentioned here.
>
>
>
>
>> Thanks,
>> Simon
>>
>>
>>
>> On Dec 26, 2016, at 11:25 PM, Laurent Gautier  wrote:
>>> (...)
>>>
>>> Is this expected ? Shouldn't R rely on the definition in stdint.h
>>>
>>
> But there need not be one in stdint.h, as the type is optional in
> C99/C11/C++11 and likely not present in C++98.


AFAIUI stdint.h is part of C99:
https://en.wikipedia.org/wiki/C_standard_library#Header_files

While at it, it is not exactly like C99 is the latest thing in town.
Wouldn't relying on it give an opportunity to simplify the code base ?
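
To illustrate the exchange above: the header stdint.h is part of C99, but
uintptr_t is an optional type within it. The following is a minimal sketch,
assuming a C99 compiler, of detecting the type through UINTPTR_MAX (which
stdint.h defines only when uintptr_t exists) and falling back otherwise.
The names addr_bits_t and addr_bits are hypothetical and do not come from
R's headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>   /* the header is required by C99; uintptr_t in it is optional */

/* UINTPTR_MAX is defined only when <stdint.h> provides uintptr_t,
 * so it can serve as a feature test. */
#ifdef UINTPTR_MAX
# define HAVE_UINTPTR_T 1
typedef uintptr_t addr_bits_t;       /* hypothetical name: platform definition */
#else
typedef unsigned long addr_bits_t;   /* last-resort guess; may be too narrow */
#endif

/* Convert an object pointer to its integer representation. */
static addr_bits_t addr_bits(const void *p)
{
    assert(p != NULL);
    return (addr_bits_t)p;
}
```

With a test like this the fallback is only reached on platforms that really
lack the type, so it cannot disagree with a definition stdint.h supplies.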




Laurent



>
>>> rather than define its own ?
>>>
>>>
>>> (report for the issue:
>>> https://bitbucket.org/rpy2/rpy2/issues/389/failed-to-compile-with-python-360-on-32
>>> )
>>>
>>>
>>> Laurent
>>>
>>>
>
> --
> Brian D. Ripley,  rip...@stats.ox.ac.uk
> Emeritus Professor of Applied Statistics, University of Oxford
>



Re: [Rd] Definition of uintptr_t in Rinterface.h

2017-01-01 Thread Dirk Eddelbuettel

On 1 January 2017 at 13:28, Prof Brian Ripley wrote:
| On 29/12/2016 15:55, Simon Urbanek wrote:
| > The problem is elsewhere - Rinterface.h guards the ultima-ratio fallback
| > with HAVE_UINTPTR_T but that config flag is not exported in Rconfig.h. Should
| > be now fixed in R-devel - please check if that works for you.
| 
| Rconfig.h would be appropriate if Rinterface.h is being included from C 
| code using the same compiler as used for R.  But as Rinterface.h is 
| intended for use by alternative front ends there is no guarantee that 
| they use the same compiler (and some use C++).
| 
| This was documented in the manual:
| 
| 'Note that uintptr_t is a C99 type for which a substitute is defined in 
| R, so your code needs to define HAVE_UINTPTR_T appropriately.'
| 
| AFAICS if you comply, there will not be a conflict.
| 
| Also note that is only an issue if CSTACK_DEFNS is defined, not the 
| default and not mentioned here.

Relatedly: is there any good way to ensure "using the same compiler as used
for R"?

I wrote one package that exposes some internal R API for wider use by
other packages (RApiSerialize, used e.g. by RcppRedis) and like that scheme. I
might do the same for date/time/timezone logic, but this hinges on these
settings (among other things).

Any good ideas how to go about this?  So far I mostly use 'don't ask don't
tell' and rely on the compiler to actually fail when something doesn't hold.
We can probably do better...
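
One possible way to make "rely on the compiler to fail" explicit is to state
each assumption as a compile-time assertion, so a mismatch stops the build
with a readable message. This is only a sketch, assuming a C11 compiler; the
particular assumptions listed and the helper name pointer_round_trips are
hypothetical, and it does not detect that a different compiler was used for
R, only that these specific properties hold.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>
#include <stdint.h>

/* Assumptions a (hypothetical) package makes about the platform.
 * Each fails at compile time, with its message, if it does not hold. */
static_assert(CHAR_BIT == 8, "package assumes 8-bit bytes");
static_assert(sizeof(int) == 4, "package assumes 32-bit int");
static_assert(sizeof(uintptr_t) >= sizeof(void *),
              "uintptr_t must be wide enough to hold an object pointer");

/* Runtime companion: check that a pointer survives a trip through uintptr_t. */
static int pointer_round_trips(void *p)
{
    return (void *)(uintptr_t)p == p;
}
```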

Best wishes to all useRs and devRs in 2017.

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



Re: [Rd] Definition of uintptr_t in Rinterface.h

2017-01-01 Thread Prof Brian Ripley

On 29/12/2016 15:55, Simon Urbanek wrote:

> The problem is elsewhere - Rinterface.h guards the ultima-ratio fallback with
> HAVE_UINTPTR_T but that config flag is not exported in Rconfig.h. Should be now
> fixed in R-devel - please check if that works for you.


Rconfig.h would be appropriate if Rinterface.h is being included from C 
code using the same compiler as used for R.  But as Rinterface.h is 
intended for use by alternative front ends there is no guarantee that 
they use the same compiler (and some use C++).


This was documented in the manual:

'Note that uintptr_t is a C99 type for which a substitute is defined in 
R, so your code needs to define HAVE_UINTPTR_T appropriately.'


AFAICS if you comply, there will not be a conflict.

Also note that is only an issue if CSTACK_DEFNS is defined, not the 
default and not mentioned here.
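
In a front end, complying with that note might look roughly like the
following. This is a sketch under stated assumptions: the Rinterface.h
include is commented out so the fragment compiles without R installed, the
guarded typedef below it is a stand-in for the fallback described in this
thread (not the header's actual text), and same_address is a hypothetical
helper.

```c
/* Front-end translation unit (sketch). */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>        /* the platform's own uintptr_t (assumed present) */

#define HAVE_UINTPTR_T 1   /* declare that no substitute typedef is needed */
#define CSTACK_DEFNS 1     /* the substitute only matters when this is defined */
/* #include <Rinterface.h> */

/* Stand-in for the guarded fallback discussed in the thread. */
#if defined(CSTACK_DEFNS) && !defined(HAVE_UINTPTR_T)
typedef unsigned long uintptr_t;
#endif

/* With the guard honoured there is a single definition of uintptr_t,
 * and it can hold any object pointer. */
static int same_address(void *p)
{
    assert(p != NULL);
    uintptr_t bits = (uintptr_t)p;
    return (void *)bits == p;
}
```

The ordering is the point: stdint.h and the HAVE_UINTPTR_T definition come
before the header that carries the fallback.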





> Thanks,
> Simon




> On Dec 26, 2016, at 11:25 PM, Laurent Gautier wrote:
>
>> Hi,
>>
>> It was recently pointed out to me that a definition in Rinterface.h can
>> conflict with a definition in stdint.h:
>>
>> /usr/include/R/Rinterface.h has:
>> typedef unsigned long uintptr_t;
>>
>> /usr/include/stdint.h has:
>> typedef unsigned int uintptr_t;
>> (on a 32-bit platform; the complete definition is:
>>
>> #if __WORDSIZE == 64
>> # ifndef __intptr_t_defined
>> typedef long int            intptr_t;
>> #  define __intptr_t_defined
>> # endif
>> typedef unsigned long int   uintptr_t;
>> #else
>> # ifndef __intptr_t_defined
>> typedef int                 intptr_t;
>> #  define __intptr_t_defined
>> # endif
>> typedef unsigned int        uintptr_t;
>> #endif
>>
>> )
>>
>> Is this expected ? Shouldn't R rely on the definition in stdint.h


But there need not be one in stdint.h, as the type is optional in 
C99/C11/C++11 and likely not present in C++98.



>> rather than define its own ?
>>
>>
>> (report for the issue:
>> https://bitbucket.org/rpy2/rpy2/issues/389/failed-to-compile-with-python-360-on-32
>> )
>>
>>
>> Laurent




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford



Re: [Rd] Definition of uintptr_t in Rinterface.h

2016-12-29 Thread Laurent Gautier
Thanks for looking at it. Having HAVE_UINTPTR_T defined in Rconfig.h
should fix the issue. Will the fix make it into R-3.3.3 (if that point
release is planned) or R-3.3.2-patched, or will it only arrive with R-3.4?


L.

PS: I am forwarding a thank you note to the reporter of the problem on the
rpy2 issue tracker.


2016-12-29 10:55 GMT-05:00 Simon Urbanek :

> The problem is elsewhere - Rinterface.h guards the ultima-ratio fallback
> with HAVE_UINTPTR_T but that config flag is not exported in Rconfig.h.
> Should be now fixed in R-devel - please check if that works for you.
>
> Thanks,
> Simon
>
>
>
> > On Dec 26, 2016, at 11:25 PM, Laurent Gautier wrote:
> >
> > Hi,
> >
> > It was recently pointed out to me that a definition in Rinterface.h can
> > conflict with a definition in stdint.h:
> >
> > /usr/include/R/Rinterface.h has:
> > typedef unsigned long uintptr_t;
> >
> > /usr/include/stdint.h has:
> > typedef unsigned int uintptr_t;
> > (on a 32-bit platform; the complete definition is:
> >
> > #if __WORDSIZE == 64
> > # ifndef __intptr_t_defined
> > typedef long int            intptr_t;
> > #  define __intptr_t_defined
> > # endif
> > typedef unsigned long int   uintptr_t;
> > #else
> > # ifndef __intptr_t_defined
> > typedef int                 intptr_t;
> > #  define __intptr_t_defined
> > # endif
> > typedef unsigned int        uintptr_t;
> > #endif
> >
> > )
> >
> > Is this expected ? Shouldn't R rely on the definition in stdint.h
> > rather than define its own ?
> >
> >
> > (report for the issue:
> > https://bitbucket.org/rpy2/rpy2/issues/389/failed-to-compile-with-python-360-on-32
> > )
> >
> >
> > Laurent
> >
> >
>
>



Re: [Rd] Definition of uintptr_t in Rinterface.h

2016-12-29 Thread Simon Urbanek
The problem is elsewhere - Rinterface.h guards the ultima-ratio fallback with 
HAVE_UINTPTR_T but that config flag is not exported in Rconfig.h. Should be now 
fixed in R-devel - please check if that works for you.
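
The mechanism described here can be sketched as follows. This is an
illustration only, not the text of Rinterface.h or Rconfig.h: every DEMO_
and demo_ name is hypothetical, chosen so the fragment compiles next to the
real stdint.h. It shows why a missing configuration flag sends a header down
its fallback branch even when the platform already provides the type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the exported configuration header. If this line is
 * missing (the situation described above), the #else branch below is
 * taken even though <stdint.h> already provides the type. */
#define DEMO_HAVE_UINTPTR_T 1

#ifdef DEMO_HAVE_UINTPTR_T
typedef uintptr_t demo_uintptr_t;        /* use the platform's definition */
# define DEMO_USED_FALLBACK 0
#else
typedef unsigned long demo_uintptr_t;    /* last-resort fallback */
# define DEMO_USED_FALLBACK 1
#endif

/* Report which branch of the guard was compiled. */
static int demo_used_fallback(void)
{
    return DEMO_USED_FALLBACK;
}

/* Convert an object pointer using whichever definition was selected. */
static demo_uintptr_t demo_addr(const void *p)
{
    assert(p != NULL);
    return (demo_uintptr_t)p;
}
```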

Thanks,
Simon



> On Dec 26, 2016, at 11:25 PM, Laurent Gautier  wrote:
> 
> Hi,
> 
> It was recently pointed out to me that a definition in Rinterface.h can
> conflict with a definition in stdint.h:
> 
> /usr/include/R/Rinterface.h has:
> typedef unsigned long uintptr_t;
> 
> /usr/include/stdint.h has:
> typedef unsigned int uintptr_t;
> (on a 32-bit platform; the complete definition is:
>
> #if __WORDSIZE == 64
> # ifndef __intptr_t_defined
> typedef long int            intptr_t;
> #  define __intptr_t_defined
> # endif
> typedef unsigned long int   uintptr_t;
> #else
> # ifndef __intptr_t_defined
> typedef int                 intptr_t;
> #  define __intptr_t_defined
> # endif
> typedef unsigned int        uintptr_t;
> #endif
> 
> )
> 
> Is this expected ? Shouldn't R rely on the definition in stdint.h
> rather than define its own ?
> 
> 
> (report for the issue:
> https://bitbucket.org/rpy2/rpy2/issues/389/failed-to-compile-with-python-360-on-32
> )
> 
> 
> Laurent
> 
> 



[Rd] Definition of uintptr_t in Rinterface.h

2016-12-26 Thread Laurent Gautier
Hi,

It was recently pointed out to me that a definition in Rinterface.h can
conflict with a definition in stdint.h:

/usr/include/R/Rinterface.h has:
 typedef unsigned long uintptr_t;

/usr/include/stdint.h has:
 typedef unsigned int uintptr_t;
 (on a 32-bit platform; the complete definition is:

#if __WORDSIZE == 64
# ifndef __intptr_t_defined
typedef long int            intptr_t;
#  define __intptr_t_defined
# endif
typedef unsigned long int   uintptr_t;
#else
# ifndef __intptr_t_defined
typedef int                 intptr_t;
#  define __intptr_t_defined
# endif
typedef unsigned int        uintptr_t;
#endif

)

Is this expected ? Shouldn't R rely on the definition in stdint.h
rather than define its own ?
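
The reason the two typedefs clash on a 32-bit platform is that unsigned int
and unsigned long are distinct types even where they have the same size.
A small sketch, assuming a C11 compiler (for _Generic), that reports which
type uintptr_t aliases on the machine at hand; the function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Name the standard type that uintptr_t aliases on this platform.
 * _Generic accepts this association list only because the three
 * types are distinct, even where two of them have the same size. */
static const char *uintptr_alias(void)
{
    return _Generic((uintptr_t)0,
                    unsigned int:       "unsigned int",
                    unsigned long:      "unsigned long",
                    unsigned long long: "unsigned long long",
                    default:            "other");
}

/* C11 and C++ accept a repeated typedef only if it names the same type
 * (C99 rejects any repeat), so this reports whether
 * "typedef unsigned long uintptr_t;" would clash with <stdint.h> here. */
static int unsigned_long_typedef_would_clash(void)
{
    const char *name = uintptr_alias();
    assert(name != NULL);
    return strcmp(name, "unsigned long") != 0;
}
```

Given the stdint.h excerpt quoted above, the 32-bit branch would yield
"unsigned int" and therefore a clash, while the 64-bit branch would not.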


(report for the issue:
https://bitbucket.org/rpy2/rpy2/issues/389/failed-to-compile-with-python-360-on-32
)


Laurent



Re: [Rd] Definition of [[

2009-03-16 Thread Thomas Lumley

On Sun, 15 Mar 2009, Wacek Kusnierczyk wrote:


> Stavros Macrakis wrote:
>
>> Well, that's one issue.  But another is that there should be a
>> specification addressed to users, who should not have to understand
>> internals.
>
> this should really be taken seriously.



Well, the lack of such a specification is a documented bug (see the FAQ on bug 
reporting), and I think everyone agrees it would be useful, just not as useful 
as what they would have to stop doing to write it.  In fact, such a document 
may well have a higher priority than it deserves: people who would want that 
sort of documentation are overrepresented in R-core compared to the general R 
user community.

There was a panel talk at DSC2005 (yes, four years ago) on the possibilities 
for a joint R/S language standard. That would have provided an external 
stimulus and a framework for finding all the inconsistencies. It didn't really 
eventuate.

 -thomas

Thomas Lumley                   Assoc. Professor, Biostatistics
tlum...@u.washington.edu        University of Washington, Seattle



Re: [Rd] Definition of [[

2009-03-16 Thread Wacek Kusnierczyk
somewhat on the side,

l = list(1)
   
l[[2]]
# error, index out of bounds

l[2][[1]]
# NULL

that is, we can't extract from l any element at an index exceeding the
list's length (if we could, it would have been NULL or some sort of
_NA_list), but we can extract a sublist at an index out of bounds, and
from that sublist extract the element (which is NULL, 'the _NA_list').

that's not necessarily wrong, but the item at index i (l[[i]]) is not
equivalent to the item in the sublist at index i.

vQ



Thomas Lumley wrote:
 On Sun, 15 Mar 2009, Stavros Macrakis wrote:

 The semantics of [ and [[ don't seem to be fully specified in the
 Reference manual.  In particular, I can't find where the following
 cases are covered:

 cc <- c(1); ll <- list(1)

 cc[3]
 [1] NA
 OK, RefMan says: If i is positive and exceeds length(x) then the
 corresponding selection is NA.

 dput(ll[3])
 list(NULL)
 ? i is positive and exceeds length(x); why isn't this list(NA)?

 I think some of these are because there are only NAs for character,
 logical, and the numeric types. There isn't an NA of list type.

 This one shouldn't be list(NA) - which NA would it use?  It should be
 some sort of list(_NA_list_) type, and list(NULL) is playing that role.


 ll[[3]]
 Error in list(1)[[3]] : subscript out of bounds
 ? Why does this return NA for an atomic vector, but give an error for
 a generic vector?

 Again, because there isn't an NA of generic vector type.

 cc[[3]] <- 34; dput(cc)
 c(1, NA, 34)
 OK

 ll[[3]] <- 34; dput(ll)
 list(1, NULL, 34)
 Why is second element NULL, not NA?
 And why is it OK to set an undefined ll[[3]], but not to get it?

 Same reason for NULL vs NA.  The fact that setting works may just be
 an inconsistency -- as you can see from previous discussions, R often
 does not effectively forbid code that shouldn't work -- or it may be
 bug-compatibility with some version of S or S-PLUS.



[Rd] Definition of [[

2009-03-15 Thread Stavros Macrakis
The semantics of [ and [[ don't seem to be fully specified in the
Reference manual.  In particular, I can't find where the following
cases are covered:

> cc <- c(1); ll <- list(1)

> cc[3]
[1] NA
OK, RefMan says: If i is positive and exceeds length(x) then the
corresponding selection is NA.

> dput(ll[3])
list(NULL)
? i is positive and exceeds length(x); why isn't this list(NA)?

> ll[[3]]
Error in list(1)[[3]] : subscript out of bounds
? Why does this return NA for an atomic vector, but give an error for
a generic vector?

> cc[[3]] <- 34; dput(cc)
c(1, NA, 34)
OK

> ll[[3]] <- 34; dput(ll)
list(1, NULL, 34)
Why is second element NULL, not NA?
And why is it OK to set an undefined ll[[3]], but not to get it?

I assume that these are features, not bugs, but I can't find
documentation for them.

-s



Re: [Rd] Definition of [[

2009-03-15 Thread Duncan Murdoch

On 15/03/2009 2:31 PM, Stavros Macrakis wrote:

The semantics of [ and [[ don't seem to be fully specified in the
Reference manual.  In particular, I can't find where the following
cases are covered:


cc <- c(1); ll <- list(1)



cc[3]

[1] NA
OK, RefMan says: If i is positive and exceeds length(x) then the
corresponding selection is NA.


dput(ll[3])

list(NULL)
? i is positive and exceeds length(x); why isn't this list(NA)?


Because the sentence you read was talking about simple vectors, and ll 
is presumably not a simple vector.  So what is a simple vector?  That is 
not explicitly defined, and it probably should be.  I think it is 
atomic vectors, except those with a class that has a method for [.





ll[[3]]

Error in list(1)[[3]] : subscript out of bounds
? Why does this return NA for an atomic vector, but give an error for
a generic vector?


cc[[3]] <- 34; dput(cc)

c(1, NA, 34)
OK

ll[[3]] <- 34; dput(ll)
list(1, NULL, 34)
Why is second element NULL, not NA?


NA is a length 1 atomic vector with a specific type matching the type of 
c.  It makes more sense in this context to put in a NULL, and return a 
list(NULL) for ll[3].



And why is it OK to set an undefined ll[[3]], but not to get it?


Lots of code grows vectors by setting elements beyond the end of them, 
so whether or not that's a good idea, it's not likely to change.


I think an argument could be made that ll[[toobig]] should return NULL 
rather than trigger an error, but on the other hand, the current 
behaviour allows the programmer to choose:  if you are assuming that a 
particular element exists, use ll[[element]], and R will tell you when 
your assumption is wrong.  If you aren't sure, use ll[element] and 
you'll get NA or list(NULL) if the element isn't there.



I assume that these are features, not bugs, but I can't find
documentation for them.


There is more documentation in the man page for Extract, but I think it 
is incomplete.  The most complete documentation is of course the source 
code, but it may not answer the question of what's intentional and 
what's accidental.


Duncan Murdoch



Re: [Rd] Definition of [[

2009-03-15 Thread Stavros Macrakis
Duncan,

Thanks for the reply.

On Sun, Mar 15, 2009 at 4:43 PM, Duncan Murdoch <murd...@stats.uwo.ca> wrote:
 On 15/03/2009 2:31 PM, Stavros Macrakis wrote:

 dput(ll[3])
 list(NULL)
 ? i is positive and exceeds length(x); why isn't this list(NA)?

 Because the sentence you read was talking about simple vectors, and ll is
 presumably not a simple vector.  So what is a simple vector?  That is not
 explicitly defined, and it probably should be.  I think it is atomic
 vectors, except those with a class that has a method for [.

The four subsections of 3.4 Indexing are 3.4.1 Indexing by vectors,
3.4.2 Indexing matrices and arrays, 3.4.3 Indexing other structures,
and 3.4.4 Subset assignment, so the context seems to be saying that
simple vectors are those which are not matrices or arrays, and those
(other structures) which do not overload [.

Even if the definition of 'simple vector' were clarified to cover only
atomic vectors, I still can't find any text specifying that list(3)[5]
= list(NULL).

For that matter, it would leave the subscripting of important
built-ins such as factors and dates, etc. undefined. Obviously the
intuition is that vectors of factors or vectors of dates would do the
'same thing' as vectors of integers or of strings, but 3.4.3 doesn't
say what that thing is.

 ll[[3]]

 Error in list(1)[[3]] : subscript out of bounds
 ? Why does this return NA for an atomic vector, but give an error for
 a generic vector?

 cc[[3]] <- 34; dput(cc)

 c(1, NA, 34)
 OK

 ll[[3]] <- 34; dput(ll)
 list(1, NULL, 34)
 Why is second element NULL, not NA?

 NA is a length 1 atomic vector with a specific type matching the type of c.
  It makes more sense in this context to put in a NULL, and return a
 list(NULL) for ll[3].

Understood that that's the rationale, but where is it documented?

Also, if that's the rationale, it seems to say that NULL is the
equivalent of NA for list elements, but in fact NULL does not function
like NA:

> is.na(NULL)
logical(0)
Warning message:
In is.na(NULL) : is.na() applied to non-(list or vector) of type 'NULL'
> is.na(list(NULL))
[1] FALSE

Indeed, NA seems to both up-convert and down-convert nicely to other
forms of NA:

> dput(as.integer(as.logical(c(TRUE,NA,TRUE))))
c(1L, NA, 1L)
> dput(as.logical(as.integer(c(TRUE,NA,TRUE))))
c(TRUE, NA, TRUE)

and NAs are not converted to NULL when converted to a generic vector:

> dput(as.list(c(TRUE,NA,TRUE)))
list(TRUE, NA, TRUE)

and NA is preserved when downconverting:

> dput(as.logical(as.list(c(TRUE,NA,23))))
c(TRUE, NA, TRUE)

But if you try to downconvert NULL, you get an error

> dput(as.integer(list(NULL)))
Error in isS4(x) : (list) object cannot be coerced to type 'integer'

So I don't see why NULL is the right way to represent NA, especially
since NULL is a perfectly good list element, distinct from NA.

 And why is it OK to set an undefined ll[[3]], but not to get it?

 Lots of code grows vectors by setting elements beyond the end of them, so
 whether or not that's a good idea, it's not likely to change.

I wasn't suggesting changing this.

 I think an argument could be made that ll[[toobig]] should return NULL
 rather than trigger an error, but on the other hand, the current behaviour
 allows the programmer to choose:  if you are assuming that a particular
 element exists, use ll[[element]], and R will tell you when your assumption
 is wrong.  If you aren't sure, use ll[element] and you'll get NA or
 list(NULL) if the element isn't there.

Yes, that could make sense, but why would it be true for ll[[toobig]]
but not cc[[toobig]]?

 I assume that these are features, not bugs, but I can't find
 documentation for them.

 There is more documentation in the man page for Extract, but I think it is
 incomplete.

Yes, I was looking at that man page, and I don't think it resolves any
of the above questions.

 The most complete documentation is of course the source code,
 but it may not answer the question of what's intentional and what's
 accidental.

Well, that's one issue.  But another is that there should be a
specification addressed to users, who should not have to understand
internals.

 -s



Re: [Rd] Definition of [[

2009-03-15 Thread Wacek Kusnierczyk
Stavros Macrakis wrote:

 Well, that's one issue.  But another is that there should be a
 specification addressed to users, who should not have to understand
 internals.
   

this should really be taken seriously.

vQ



Re: [Rd] Definition of [[

2009-03-15 Thread Duncan Murdoch

Just a couple of inline comments down below:

On 15/03/2009 5:30 PM, Stavros Macrakis wrote:

Duncan,

Thanks for the reply.

On Sun, Mar 15, 2009 at 4:43 PM, Duncan Murdoch <murd...@stats.uwo.ca> wrote:

On 15/03/2009 2:31 PM, Stavros Macrakis wrote:



dput(ll[3])
list(NULL)
? i is positive and exceeds length(x); why isn't this list(NA)?

Because the sentence you read was talking about simple vectors, and ll is
presumably not a simple vector.  So what is a simple vector?  That is not
explicitly defined, and it probably should be.  I think it is atomic
vectors, except those with a class that has a method for [.


The four subsections of 3.4 Indexing are 3.4.1 Indexing by vectors,
3.4.2 Indexing matrices and arrays, 3.4.3 Indexing other structures,
and 3.4.4 Subset assignment, so the context seems to be saying that
simple vectors are those which are not matrices or arrays, and those
(other structures) which do not overload [.

Even if the definition of 'simple vector' were clarified to cover only
atomic vectors, I still can't find any text specifying that list(3)[5]
= list(NULL).

For that matter, it would leave the subscripting of important
built-ins such as factors and dates, etc. undefined. Obviously the
intuition is that vectors of factors or vectors of dates would do the
'same thing' as vectors of integers or of strings, but 3.4.3 doesn't
say what that thing is.


ll[[3]]

Error in list(1)[[3]] : subscript out of bounds
? Why does this return NA for an atomic vector, but give an error for
a generic vector?


cc[[3]] <- 34; dput(cc)

c(1, NA, 34)
OK

ll[[3]] <- 34; dput(ll)
list(1, NULL, 34)
Why is second element NULL, not NA?

NA is a length 1 atomic vector with a specific type matching the type of c.
 It makes more sense in this context to put in a NULL, and return a
list(NULL) for ll[3].


Understood that that's the rationale, but where is it documented?

Also, if that's the rationale, it seems to say that NULL is the
equivalent of NA for list elements, but in fact NULL does not function
like NA:


is.na(NULL)

logical(0)
Warning message:
In is.na(NULL) : is.na() applied to non-(list or vector) of type 'NULL'

is.na(list(NULL))

[1] FALSE

Indeed, NA seems to both up-convert and down-convert nicely to other
forms of NA:


dput(as.integer(as.logical(c(TRUE,NA,TRUE))))

c(1L, NA, 1L)

dput(as.logical(as.integer(c(TRUE,NA,TRUE))))

c(TRUE, NA, TRUE)

and NAs are not converted to NULL when converted to a generic vector:


dput(as.list(c(TRUE,NA,TRUE)))

list(TRUE, NA, TRUE)

and NA is preserved when downconverting:


dput(as.logical(as.list(c(TRUE,NA,23))))

c(TRUE, NA, TRUE)

But if you try to downconvert NULL, you get an error


dput(as.integer(list(NULL)))

Error in isS4(x) : (list) object cannot be coerced to type 'integer'

So I don't see why NULL is the right way to represent NA, especially
since NULL is a perfectly good list element, distinct from NA.


And why is it OK to set an undefined ll[[3]], but not to get it?

Lots of code grows vectors by setting elements beyond the end of them, so
whether or not that's a good idea, it's not likely to change.


I wasn't suggesting changing this.


I think an argument could be made that ll[[toobig]] should return NULL
rather than trigger an error, but on the other hand, the current behaviour
allows the programmer to choose:  if you are assuming that a particular
element exists, use ll[[element]], and R will tell you when your assumption
is wrong.  If you aren't sure, use ll[element] and you'll get NA or
list(NULL) if the element isn't there.


Yes, that could make sense, but why would it be true for ll[[toobig]]
but not cc[[toobig]]?


But it is:

> cc <- c(1)
> cc[[3]]
Error in cc[[3]] : subscript out of bounds


I assume that these are features, not bugs, but I can't find
documentation for them.



There is more documentation in the man page for Extract, but I think it is
incomplete.


Yes, I was looking at that man page, and I don't think it resolves any
of the above questions.


The most complete documentation is of course the source code,
but it may not answer the question of what's intentional and what's
accidental.


Well, that's one issue.  But another is that there should be a
specification addressed to users, who should not have to understand
internals.


I agree, but not so strongly that I will drop everything and write one.

Duncan Murdoch
