Re: [Rd] duplicated factor labels.

2017-06-23 Thread Paul Johnson
On Fri, Jun 23, 2017 at 7:20 AM, Uwe Ligges wrote: > > > On 23.06.2017 11:51, peter dalgaard wrote: >> >> Hmm, the danger in this is that duplicated factor levels _used_ to be >> allowed (i.e. multiple codes with the same level). Disallowing it is what >> broke read.spss() on some files, because S

Re: [Rd] duplicated factor labels.

2017-06-23 Thread Joris Meys
On Fri, Jun 23, 2017 at 2:20 PM, Uwe Ligges wrote: > > > > I had the chance to look at > 1300 SPSS files our consulting center > collected during the last 20 years, and in several hundred cases we found > such a problem that was a copy & paste error and simply wrong. > Only in < 5 cases condensing s

Re: [Rd] duplicated factor labels.

2017-06-23 Thread Martin Maechler
> peter dalgaard > on Fri, 23 Jun 2017 11:51:05 +0200 writes: > Hmm, the danger in this is that duplicated factor levels _used_ to be allowed (i.e. multiple codes with the same level). Disallowing it is what broke read.spss() on some files, because SPSS's concept of value labels

Re: [Rd] duplicated factor labels.

2017-06-23 Thread Uwe Ligges
On 23.06.2017 11:51, peter dalgaard wrote: Hmm, the danger in this is that duplicated factor levels _used_ to be allowed (i.e. multiple codes with the same level). Disallowing it is what broke read.spss() on some files, because SPSS's concept of value labels is not 1-to-1 with factors. Real

Re: [Rd] duplicated factor labels.

2017-06-23 Thread peter dalgaard
Hmm, the danger in this is that duplicated factor levels _used_ to be allowed (i.e. multiple codes with the same level). Disallowing it is what broke read.spss() on some files, because SPSS's concept of value labels is not 1-to-1 with factors. Reallowing it with different semantics could be pr

Re: [Rd] duplicated factor labels.

2017-06-23 Thread Martin Maechler
> Martin Maechler > on Thu, 22 Jun 2017 11:43:59 +0200 writes: > Paul Johnson > on Fri, 16 Jun 2017 11:02:34 -0500 writes: >> On Fri, Jun 16, 2017 at 2:35 AM, Joris Meys wrote: >>> To extend on Martin's explanation: >>> >>> In factor(), levels are the

Re: [Rd] duplicated factor labels.

2017-06-22 Thread Martin Maechler
> Paul Johnson > on Fri, 16 Jun 2017 11:02:34 -0500 writes: > On Fri, Jun 16, 2017 at 2:35 AM, Joris Meys wrote: >> To extend on Martin's explanation: >> >> In factor(), levels are the unique input values and labels the unique output >> values. So the function

Re: [Rd] duplicated factor labels.

2017-06-16 Thread Joris Meys
Hi Paul, Now I see what you're getting at. I misread your original mail completely. So we definitely agree, and wholeheartedly even. The use case you just gave is definitely in my top 5 of frustrations about R. I would like to be able to assign the same label to multiple levels without having to

Re: [Rd] duplicated factor labels.

2017-06-16 Thread Paul Johnson
On Fri, Jun 16, 2017 at 2:35 AM, Joris Meys wrote: > To extend on Martin's explanation: > > In factor(), levels are the unique input values and labels the unique output > values. So the function levels() actually displays the labels. > Dear Joris I think we agree. Currently, factor insists bot

Re: [Rd] duplicated factor labels.

2017-06-16 Thread Joris Meys
To extend on Martin's explanation: In factor(), levels are the unique input values and labels the unique output values. So the function levels() actually displays the labels. Cheers Joris On 15 Jun 2017 17:15, "Martin Maechler" wrote: > Paul Johnson > on Wed, 14 Jun 2017 19:00:
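
The distinction Joris draws in the message above (levels as input values, labels as output values) can be seen in a minimal sketch. The data and label names here are made up for illustration and do not come from the thread:

```r
# Hypothetical data, not taken from the thread
x <- c(1, 2, 3, 2, 1)

# 'levels' lists the input values to recognise;
# 'labels' names the resulting output categories
f <- factor(x, levels = c(1, 2, 3), labels = c("low", "mid", "high"))

levels(f)      # returns the labels, not the original input values
as.integer(f)  # the internal integer codes behind each element
```

After construction the original input values 1, 2, 3 are no longer stored in the factor; only the integer codes and the labels remain, which is why levels() reports the labels.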

Re: [Rd] duplicated factor labels.

2017-06-15 Thread Martin Maechler
> Paul Johnson > on Wed, 14 Jun 2017 19:00:11 -0500 writes: > Dear R devel > I've been wondering about this for a while. I am sorry to ask for your > time, but can one of you help me understand this? > This concerns duplicated labels, not levels, in the factor functio

[Rd] duplicated factor labels.

2017-06-14 Thread Paul Johnson
Dear R devel I've been wondering about this for a while. I am sorry to ask for your time, but can one of you help me understand this? This concerns duplicated labels, not levels, in the factor function. I think it is hard to understand that factor() fails, but levels() after does not > x <- 1:
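
The second half of that observation, that assigning duplicated labels through levels() after construction is accepted, might be sketched as follows. The vector and label names are hypothetical, not Paul's original example. Whether factor() itself accepts duplicated labels depends on the R version, so that call is left out here:

```r
# Hypothetical data, not the example from the original message
g <- factor(c(1, 2, 3, 2, 1))

# Assign the same label to two of the three existing levels;
# the replacement function merges them into one level
levels(g) <- c("low", "low", "high")

levels(g)      # the merged set of levels
as.integer(g)  # codes after merging
table(g)       # counts per merged level
```

The inputs 1 and 2 end up sharing a single level, which is the many-to-one mapping the thread goes on to discuss.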