Re: [R] melt function chooses wrong id variable with large datasets

2015-04-17 Thread PIKAL Petr
Yes

It could be. But anyway, if you want to melt your data frame and be sure that 
the norm column ends up in the same column as the months, you should use

melt(dataset, id.vars=NULL, na.rm=TRUE)

Without the id.vars=NULL argument, and with the data you sent, I get

> head(dd.m0)
                norm variable value
1   45.8713463281901  januari  38.1
2 24.047250681782984  januari  32.4
3 3.7533684144746324  januari  34.5
4 38.594241119279324  januari  20.7
5 26.391897460120358  januari  21.5
6 61.746470001194638  januari  23.1
...

Cheers
Petr

PS. Keep your posts on R-help: others may learn from them, and others may be 
able to give you a more detailed explanation.


From: Joachim Audenaert [mailto:joachim.audena...@pcsierteelt.be]
Sent: Friday, April 17, 2015 8:23 AM
To: PIKAL Petr
Subject: RE: [R] melt function chooses wrong id variable with large datasets

Hello,

I upgraded R to version 3.1.3 and now everything in the script works 
perfectly. Could the trouble be due to the fact that I was running an older 
version?

Met vriendelijke groeten - With kind regards,

Joachim Audenaert
onderzoeker gewasbescherming - crop protection researcher

PCS | proefcentrum voor sierteelt - ornamental plant research


Schaessestraat 18, 9070 Destelbergen, België
T: +32 (0)9 353 94 71 | F: +32 (0)9 353 94 95
E: joachim.audena...@pcsierteelt.be | W: www.pcsierteelt.be




Re: [R] melt function chooses wrong id variable with large datasets

2015-04-16 Thread Jeff Newmiller
Maybe what you really want is the ?stack function.
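
A minimal sketch of what stack() does, using a three-row excerpt of the posted januari and februari columns (the object names wide and long are made up for illustration). stack() is part of base R, so no extra package is needed:

```r
# Three-row excerpt of two columns from the posted data (illustrative only).
wide <- data.frame(januari  = c(38.1, 32.4, 34.5),
                   februari = c(31.5, 36.2, 38.2))

# stack() concatenates the columns into 'values' and records the name of
# the source column in the factor 'ind'.
long <- stack(wide)
print(long)
```

The result has one row per original cell, so a formula such as values ~ ind can be passed to a grouping test in the same way as value ~ variable after melt().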


Re: [R] melt function chooses wrong id variable with large datasets

2015-04-16 Thread Joachim Audenaert
Thanks,

Indeed, norm should be in the same group as the months. Everything works 
fine when the number of data points is quite small, but with big datasets 
(15 000 values) things seem to go wrong and I can't explain why. It puts norm 
in a separate column instead of in the group of months, as it does when 
the dataset is small.


Re: [R] melt function chooses wrong id variable with large datasets

2015-04-16 Thread PIKAL Petr
Hi

With this dataset I get

> dd.m0<-melt(dataset, na.rm=T)
Using norm as id variables
> head(dd.m0)
                norm variable value
1   45.8713463281901  januari  38.1
2 24.047250681782984  januari  32.4
3 3.7533684144746324  januari  34.5
4 38.594241119279324  januari  20.7
5 26.391897460120358  januari  21.5
6 61.746470001194638  januari  23.1
>
or

> dd.m<-melt(dataset, id.vars=NULL, na.rm=T)
> head(dd.m)
  variable value
1  januari  38.1
2  januari  32.4
3  januari  34.5
4  januari  20.7
5  januari  21.5
6  januari  23.1
> tail(dd.m)
    variable              value
255     norm  4.856812959269508
256     norm 5.3982910143166514
257     norm 46.553976273304215
258     norm 17.566272518985429
259     norm 20.552451905814117
260     norm 61.894775704479279

The latter puts norm in the same column as the months. Is that intended?

Maybe you want

> dd.m1<-melt(dataset[,-13], na.rm=T)
No id variables; using all as measure variables
> head(dd.m1)
  variable value
1  januari  38.1
2  januari  32.4
3  januari  34.5
4  januari  20.7
5  januari  21.5
6  januari  23.1
> tail(dd.m1)
    variable value
235 december  20.7
236 december  30.9
237 december  36.2
238 december  21.0
239 december  20.2
240 december  21.3
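
A possible explanation for the "Using norm as id variables" message: in the posted dput() output norm is stored as character strings, and when neither id.vars nor measure.vars is given, melt() from reshape2 treats character and factor columns as id variables. The sketch below uses a three-row excerpt of the posted data and assumes reshape2 is installed (the melt() call is skipped otherwise); non_numeric and long are illustrative names, not part of the original script.

```r
# Three-row excerpt of the posted data; norm is character, as in the dput() output.
dataset <- data.frame(
  januari  = c(38.1, 32.4, 34.5),
  februari = c(31.5, 36.2, 38.2),
  norm     = c("45.8713463281901", "24.047250681782984", "3.7533684144746324"),
  stringsAsFactors = FALSE
)

# Columns that are not numeric are the candidates melt() would pick as ids.
non_numeric <- names(dataset)[!sapply(dataset, is.numeric)]
print(non_numeric)

# Convert norm to numeric so that every column can be a measure variable.
dataset$norm <- as.numeric(dataset$norm)

# Assumption: reshape2 is available; id.vars = NULL keeps all columns as measures.
if (requireNamespace("reshape2", quietly = TRUE)) {
  long <- reshape2::melt(dataset, id.vars = NULL, na.rm = TRUE)
  print(table(long$variable))
}
```

Converting norm first also keeps the value column numeric; melting a mix of numeric and character columns would otherwise coerce all values to character.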

Cheers
Petr


Re: [R] melt function chooses wrong id variable with large datasets

2015-04-16 Thread Joachim Audenaert
Hello,

This is a part of my dataset:

structure(list(januari = c(38.1, 32.4, 34.5, 20.7, 21.5, 23.1, 
29.7, 36.6, 36.1, 20.6, 20.4, 30.1, 38.7, 41.4, 37, 36, 37, 38, 
23, 26.7), februari = c(31.5, 36.2, 38.2, 26.4, 20.9, 21.5, 30.2, 
33.4, 32.6, 22.2, 21.7, 30, 35.7, 32.8, 39.3, 25.5, 23, 19.9, 
21.3, 20.8), maart = c(34.2, 27, 24.2, 19.9, 19.7, 21.5, 30.6, 
30, 19, 19.6, 20.6, 23.6, 17.9, 17.3, 21.4, 24.1, 20.9, 30.1, 
32.6, 21.3), april = c(26.3, 29.6, 30.3, 23.6, 28.4, 20.7, 24.1, 
27.3, 23.2, 18.3, 24.6, 27.4, 20.4, 18.1, 25.2, 19.8, 21, 23.7, 
19.6, 18.1), mei = c(23.7, 24, 17.2, 23.2, 25.2, 17.2, 16, 15.6, 
13.4, 16, 16.8, 14.6, 19.4, 21, 19.5, 18.5, 13.3, 13.7, 14.3, 
14.1), juni = c(17.7, 14.2, 16.6, 15.7, 13.7, 14.7, 13.1, 12.9, 
15.4, 11.9, 15.2, 15.3, 16.5, 16.1, 11.7, 11.2, 11.5, 10.8, 16.1, 
14.8), juli = c(15.7, 14.5, 10.8, 10.5, 13.4, 12.2, 13.2, 13, 
12.4, 13.1, 9.8, 10.5, 13.4, 11, 13.1, 15, 16.7, 16.1, 18.2, 
15.7), augustus = c(12.9, 12.8, 15.2, 14.5, 17.2, 14.5, 14.4, 
11, 13.1, 13.6, 14.6, 12.7, 13.6, 12.7, 15.5, 17.4, 15.2, 14.2, 
17.7, 19.2), september = c(15.6, 15.5, 15.9, 15.1, 16, 19.4, 
21.5, 23.7, 18.7, 23.8, 18, 16.2, 18.5, 20.6, 18.3, 22.5, 26.9, 
19.4, 15.9, 20.5), oktober = c(21.4, 20.8, 14, 17, 23, 26.4, 
19.6, 22.7, 26.9, 14.7, 15.2, 19.8, 26.9, 20.2, 14.3, 14.8, 18.5, 
21.7, 21.4, 21.8), november = c(24.7, 26.2, 29, 21.6, 17.1, 16.9, 
19.1, 24.7, 25.4, 19.8, 18.2, 16.3, 17, 17.7, 15.5, 14.7, 15.8, 
19.9, 20.4, 23.3), december = c(19.8, 27, 21, 33, 22.6, 28.3, 
21.1, 19, 17.3, 27, 30.2, 24.8, 17.9, 17.9, 20.7, 30.9, 36.2, 
21, 20.2, 21.3), norm = c("45.8713463281901", "24.047250681782984", 
"3.7533684144746324", "38.594241119279324", "26.391897460120358", 
"61.746470001194638", "6.8321020448487992", "11.933109250115226", 
"51.951891096493924", "37.424611852237945", "5.1587836676942374", 
"36.552835044409434", "31.781209673851027", "29.09146215582853", 
"4.856812959269508", "5.3982910143166514", "46.553976273304215", 
"17.566272518985429", "20.552451905814117", "61.894775704479279"
)), .Names = c("januari", "februari", "maart", "april", "mei", 
"juni", "juli", "augustus", "september", "oktober", "november", 
"december", "norm"), row.names = c(NA, 20L), class = "data.frame")

I transform my dataset with the following script:

y <- melt(dataset,na.rm=TRUE)
variable <- y[,1] 
value <- y[,2]

and can then perform a levene test as follows:

LEVENE <- leveneTest(value~variable,y)

When the dataset is small, let's say fewer than 100 values per column, 
everything works great. I get the message: 

No id variables; using all as measure variables

When the dataset is much bigger I get the following message:

Using norm as id variables

Why does this function pick norm as the id variable, and how can I tell R 
that each column title is my variable?

 

Re: [R] melt function chooses wrong id variable with large datasets

2015-04-16 Thread PIKAL Petr
Hi

There is something weird with your data and the melt function.

AFAIK melt does not use the first row as id variables.

What is the result of

str(dataset)

Instead of

melt(dataset,id.vars=dataset[1,], na.rm=TRUE)

melt expects something like

melt(dataset, id.vars=c("norm", "jaar"), na.rm=TRUE)

If you want a more specific answer, you should show us part of your data, 
preferably by copying the output of

dput(dataset[1:20,])

into your mail.
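
As a rough illustration of these diagnostics, the sketch below builds a small stand-in data frame and inspects it with base R only. The data are hypothetical: the "n/a" cell is an assumption about how a column imported from a spreadsheet might end up as text, not something confirmed in this thread.

```r
# Hypothetical stand-in for an imported sheet where one column arrived as text.
dataset <- data.frame(
  jaar = c(38.1, 32.4, 34.5),
  norm = c("45.87", "24.05", "n/a"),   # assumed stray text cell
  stringsAsFactors = FALSE
)

# Compact overview of the column types.
str(dataset)
col_classes <- sapply(dataset, class)
print(col_classes)

# Entries that cannot be parsed as numbers become NA after conversion.
bad <- is.na(suppressWarnings(as.numeric(dataset$norm)))
print(dataset$norm[bad])

# Paste-able representation of the first rows for a mail to the list.
dput(dataset[1:2, ])
```

If str() reports chr or Factor for a column that should hold numbers, the offending cells can be located as above before reshaping.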

Cheers
Petr


[R] melt function chooses wrong id variable with large datasets

2015-04-16 Thread Joachim Audenaert
Hello all,

I'm using a large dataset consisting of 2 groups of data: 2 columns in 
Excel with a header (group name) and 15 000 rows of data. I would like 
to compare these data, so I transform my dataset with the melt 
function to get 1 column of data and 1 column of ID variables; then I can 
apply different statistical tests. With small datasets this works great: 
the melt function automatically chooses the name in row 1 as the ID variable 
and melts the data, thus giving me a matrix with all ID variables in 
column one and the corresponding data in column 2. 
With this big dataset, however, it chooses the whole first column as ID 
variables instead of the first row. Is there a reason why this happens, 
and how can I make sure the first row is chosen as the ID variable and the 
lower rows as data? 

If I specify that I want the first row to be the id variable, I also get 
an error. 

melt(dataset,id.vars=dataset[1,], na.rm=TRUE)

Error: id variables not found in data: norm, jaar

Are there alternative ways to create a good reshaped dataset?


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.