I was thinking about how one does things in a language that is properly
object-oriented versus R, which makes various half-assed attempts at being such.
Clearly in some such languages you can make an object that is a wrapper that
allows you to save an item that is the main payload as well as anyth
Adrian,
This is an aside. I note that many machine-learning algorithms actually do
something along the lines being discussed. They may take an item like a
paragraph of words or an email message and add thousands of columns with each
one being a Boolean specifying if a particular word is
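
A minimal sketch of the idea in the message above, in base R: one logical column per word, marking whether each document contains it. The toy documents and the names `docs`, `vocab` and `has_word` are made up for illustration and are not from the thread.

```r
# Hypothetical toy documents; all names here are illustrative assumptions.
docs <- c(msg1 = "the cat sat", msg2 = "the dog barked", msg3 = "a cat and a dog")

# Split each document into lower-case words.
tokens <- strsplit(tolower(docs), "[^a-z]+")

# The vocabulary is every distinct word seen in any document.
vocab <- sort(unique(unlist(tokens)))

# One logical column per word: TRUE if the document contains that word.
has_word <- t(vapply(tokens, function(words) vocab %in% words,
                     logical(length(vocab))))
dimnames(has_word) <- list(names(docs), vocab)

# Look up two of the word columns for the first document.
has_word["msg1", c("cat", "dog")]
```

With a realistic vocabulary the matrix gets very wide and mostly FALSE, which is the trade-off the message points at.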
Adrian,
Agreed. Doubling hundreds of columns of data, as you described, is indeed a
painful way to get what you want. There are straightforward ways,
especially if you use tidyverse packages rather than base R. Just a warning,
this message is a tad long for anyone not interested to sk
Commits 80162 and 80163 lock the base environment and namespace during
startup, leading to an error when attempting to directly assign anything
from within Rprofile.site. While this is intentional and good, the help
file has not been updated to reflect this change.
Startup.Rd ( Description, paragr
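
The kind of error described above can be reproduced without touching the base environment. The sketch below locks an ordinary environment as a stand-in; the names `locked`, `unlocked` and `my_setting` are hypothetical, and this is not the code from the commits.

```r
# A stand-in for a locked environment (an assumption for illustration;
# the message above is about the base environment and namespace).
locked <- new.env()
lockEnvironment(locked)

# Adding a new binding to a locked environment signals an error.
result <- tryCatch(
  {
    assign("my_setting", 1, envir = locked)
    "assigned"
  },
  error = function(e) conditionMessage(e)
)
result

# Assigning into an environment that is not locked still works.
unlocked <- new.env()
assign("my_setting", 1, envir = unlocked)
get("my_setting", envir = unlocked)
```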
Hi all,
When I first heard about ALTREP, I wondered how it might be used
to store special missing value information - how can we learn more
about implementing ALTREP classes? The idea of carrying around a "meaning
of my NAs" vector, as Gabe said, would be very interesting!
I've done
Hi All,
So there is a not particularly active, but closely curated (i.e., everything
on there should be good in terms of principled examples) GitHub
organization of ALTREP examples: https://github.com/ALTREP-examples.
Currently there are two examples by Luke (including a package version of
the memory
On Mon, May 24, 2021 at 5:47 PM Gabriel Becker
wrote:
> Hi Adrian,
>
> I had the same thought as Luke. It is possible that you can develop an
> ALTREP that carries around the tagging information you're looking for in a
> way that is more persistent (in some cases) than R-level attributes and
> mo
Hi Adrian,
I had the same thought as Luke. It is possible that you can develop an
ALTREP that carries around the tagging information you're looking for in a
way that is more persistent (in some cases) than R-level attributes and
more hidden than additional user-visible columns.
The downsides to t
On Mon, May 24, 2021 at 4:40 PM Bertram, Alexander via R-devel <
r-devel@r-project.org> wrote:
> Dear Adrian,
> SPSS and other packages handle this problem in a very similar way to what I
> described: they store additional metadata for each variable. You can see
> this in the way that SPSS organiz
luke,
> PLEASE DO NOT DO THIS!
very happy to withdraw my offered alternative!
cheers, Greg
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Hi Taras,
On Mon, May 24, 2021 at 4:20 PM Taras Zakharko
wrote:
> Hi Adrian,
>
> Have a look at vctrs package — they have low-level primitives that might
> simplify your life a bit. I think you can get quite far by creating a
> custom type that stores NAs in an attribute and utilizes vctrs proxy
Dear Adrian,
SPSS and other packages handle this problem in a very similar way to what I
described: they store additional metadata for each variable. You can see
this in the way that SPSS organizes its file format: each "variable" has
additional metadata that indicate how specific values of the va
Hi Adrian,
Have a look at vctrs package — they have low-level primitives that might
simplify your life a bit. I think you can get quite far by creating a custom
type that stores NAs in an attribute and utilizes vctrs proxy functionality to
preserve these attributes across different operations.
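
A rough base R sketch of the attribute idea follows. It does not use vctrs: a hand-written `[` method plays the role that the vctrs proxy functionality would play. The class name `tagged_na`, the attribute name `na_reason` and the example data are assumptions made up for illustration.

```r
# Constructor for a hypothetical S3 class "tagged_na" (name and layout are
# assumptions for illustration, not part of vctrs or any other package).
tagged_na <- function(x, reason = rep(NA_character_, length(x))) {
  stopifnot(length(reason) == length(x))
  reason[!is.na(x)] <- NA_character_  # only missing values carry a reason
  structure(x, na_reason = reason, class = "tagged_na")
}

# Subsetting has to subset the attribute as well; the default `[` drops it.
`[.tagged_na` <- function(x, i) {
  tagged_na(unclass(x)[i], attr(x, "na_reason")[i])
}

# Accessor for the stored reasons.
na_reason <- function(x) attr(x, "na_reason")

age <- tagged_na(c(34, NA, 51, NA), c(NA, "refused", NA, "not applicable"))

is.na(age)               # the values still behave as ordinary NAs
na_reason(age[c(2, 4)])  # the reasons survive subsetting
```

Only `[` is covered here. Other operations (`c()`, arithmetic, joins) would each need their own method to keep the attribute, which is the bookkeeping a package like vctrs aims to simplify.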
On Mon, 24 May 2021, Adrian Dușa wrote:
On Mon, May 24, 2021 at 2:11 PM Greg Minshall wrote:
[...]
if you have 500 columns of possibly-NA'd variables, you could have one
column of 500 "bits", where each bit has one of N values, N being the
number of explanations the corresponding column has f
Dear Alex,
Thanks for piping in, I am learning with each new message.
The problem is clear, the solution escapes me though. I've already tried
the attributes route: it is going to triple the data size: along with the
additional (logical) variable that specifies which level is missing, one
also nee
On Mon, May 24, 2021 at 2:11 PM Greg Minshall wrote:
> [...]
> if you have 500 columns of possibly-NA'd variables, you could have one
> column of 500 "bits", where each bit has one of N values, N being the
> number of explanations the corresponding column has for why the NA
> exists.
>
The mere
Dear Adrian,
I just wanted to pipe in and underscore Tomas's point: the payload bits of
IEEE 754 floating point values are no place to store data that you care
about or need to keep. That is not only related to the R APIs, but also how
processors handle floating point values and signaling and non-s
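
For readers who want to see what "payload bits" refers to, here is a small base R sketch. The helper `bytes_of` is a hypothetical name introduced here; the only facts assumed are that R's `NA_real_` is a NaN with a particular payload and that `?NA` documents the result of mixing `NA` and `NaN` as platform-dependent.

```r
# Big-endian byte view of a double, so the bytes read left to right.
# `bytes_of` is a hypothetical helper, not an R API.
bytes_of <- function(x) writeBin(x, raw(), endian = "big")

bytes_of(NA_real_)  # a NaN bit pattern whose low-order bytes mark it as NA
bytes_of(NaN)       # an ordinary NaN

# At R level both count as missing; only is.nan() tells them apart.
is.na(c(NA_real_, NaN))
is.nan(c(NA_real_, NaN))

# Mixing the two: the result is missing, but whether it comes out as NA or
# NaN may depend on the platform, so the payload cannot be relied upon.
mixed <- NA_real_ + NaN
is.na(mixed)
```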
On Mon, May 24, 2021 at 1:31 PM Tomas Kalibera
wrote:
> [...]
>
> For the reasons I explained, I would be against such a change. Keeping the
> data on the side, as also recommended by others on this list, would allow
> for a reliable implementation. I don't want to support fragile package
> c
Adrian,
> If it was only one column then your solution is neat. But with 5-600
> variables, each of which can contain multiple missing values, to
> double this number of variables just to describe NA values seems to me
> excessive. Not to mention we should be able to quickly convert /
> import /
Hmm...
If it was only one column then your solution is neat. But with 5-600
variables, each of which can contain multiple missing values, to double
this number of variables just to describe NA values seems to me excessive.
Not to mention we should be able to quickly convert / import / export from
o
On 5/24/21 11:46 AM, Adrian Dușa wrote:
> On Sun, May 23, 2021 at 10:14 PM Tomas Kalibera
> wrote:
>
> [...]
>
> Good, but unfortunately the delineation between computation and
> non-computation is not always transparent. Even if an operation
> d
On Sun, May 23, 2021 at 10:14 PM Tomas Kalibera
wrote:
> [...]
>
> Good, but unfortunately the delineation between computation and
> non-computation is not always transparent. Even if an operation doesn't
> look like "computation" on the high-level, it may internally involve
> computation - so, r