Re: [R] comparition of occurrence of multiple variables between two dataframes

2017-09-13 Thread PIKAL Petr
Hi

Instead of posting head(data100), try copying the output of

dput(head(data100))

directly into your post.

This shows us your exact data together with their modes.

And switch to plain text emails; HTML formatting results in quite messy mails.
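
To make the dput() advice concrete, here is a minimal sketch of the round trip. The small data frame is a stand-in built from two rows and three columns of the head(data100) values posted in this thread; the name data100_head is hypothetical.

```r
# Stand-in for head(data100): trees 1291 and 1083, three of the posted columns.
data100_head <- data.frame(
  CV12   = c(0L, 4L),
  CV14   = c(1L, 1L),
  ecoval = c(1192L, 424L),
  row.names = c("1291", "1083")
)

# dput() prints an R expression that can be pasted into a plain text mail.
dput(data100_head)

# A reader rebuilds the object by evaluating the pasted text.
pasted  <- paste(capture.output(dput(data100_head)), collapse = "\n")
rebuilt <- eval(parse(text = pasted))
str(rebuilt)
```

Unlike a printed table, the pasted expression keeps the column types and row names, so nothing depends on how the mail client lays out the text.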

Cheers
Petr



Re: [R] comparition of occurrence of multiple variables between two dataframes

2017-09-12 Thread Céline Lüscher
Yes of course, I can share this short view of the datas.

Here is the head() of data100, containing all the trees with a final value
higher than 100:

      CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13
1291     0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
1083     0    4    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3919     0    0    0    0    0    0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    0    0    0
14685    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
4021     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
5452     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0

      IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32
1291     0    0    0    0    0    0    0    0   30    0    0    0    0    0    0    0    0    0    0    0    0    0    0
1083     3    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3919     0    0    1    0    2    0    0    0    2    0    0    0    3    0    0    0    0    0    0   11    0    0    0
14685    0    0    0    0    0    0    0    0   11    0    0    0    0    0    0    0    0    0    0    0    0    0    0
4021     0    0    0    0    0    0    0    0   11    0    0    0    0    0    0    0    0    0    0    0    0    0    0
5452     0    0    1    0    0    0    0    0    0    0    0    2    0    0    0    0    0    0    0    0    0    0    0

      EP11 EP12 EP13 EP14 EP21 EP31 EP32 EP33 EP34 EP35 NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval
1291     0    8    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0   1192
1083     0    8    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    424
3919     1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    380
14685    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    370
4021     0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    358
5452     0    0    0    0    0    0    1    0    0   11    0    0    0    0    1    0    0    356

The columns are the possible structures found on a tree (cavity, scar…)

And the same for data0:

      CV11 CV12 CV13 CV14 CV15 CV21 CV22 CV23 CV24 CV25 CV26 CV31 CV32 CV33 CV41 CV42 CV43 CV44 CV51 CV52 IN11 IN12 IN13
4728     0    0    0    1    0    0    0    3    0    0    0    0    0    0    0    0    0    0    0    1    1    0    0
5339     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0
11766    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
796      0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3561     0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0
10581    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0

      IN14 IN21 IN22 IN23 IN31 IN32 IN33 IN34 BA11 BA12 BA21 DE11 DE12 DE13 DE14 DE15 GR11 GR12 GR13 GR21 GR22 GR31 GR32
4728     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0
5339     1    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0
11766    0    0    0    0    0    0    0    0    0    0    1    1    0    0    0    0    0    0    0    0    0    0    0
796      1    1    0    0    1    0    0    0    1    0    0    0    0    0    0    0    0    0    0    0    0    0    0
3561     0    0    0    0    0    0    0    0    3    0    0    0    0    0    0    0    0    0    0    0    0    0    0
10581    0    0    0    1    0    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    0    0

      EP11 EP12 EP13 EP14 EP21 EP31 EP32 EP33 EP34 EP35 NE11 NE12 NE21 OT11 OT12 OT21 OT22 ecoval
4728     0    0    1    0    0    1    0    0    0    0    0    0    0    0    0    0    0     99
5339     0    1    0    0    0    0    1    0    0    0    0    0    0    0    0    0    0     99
11766    0    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    1     99
796      1    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     98
3561     0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0    0     98
10581    0    0    0    0    0    0    0    1    0    0    0    0    0    0    0    1    0     98


In short, the question is how to find a test to compare the occurrence of the
structures between the group higher than 100 and the group under 100.

Thank you for your help,
C.




Re: [R] comparition of occurrence of multiple variables between two dataframes

2017-09-12 Thread David L Carlson
You need to learn how to send plain text messages. See below what happened to
your HTML table when the list converted it to plain text. It is unreadable.

In your original post, you say "The very last column is the value given to the
tree, according to the structures found on it (each structure giving a number
of points to the tree by its presence on it)."

That suggests that the point value is based on the structures. If that is so,
the answer to your question is "yes": higher point values will have different
numbers/combinations of structures because you computed the point value from
the structures. No statistical test between the two groups will be valid
because the groups were not formed independently of the structures.
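
A small sketch of why the groups are not independent of the structures. The point values 15 (CV11) and 4 (IN12) and the counts for tree 12607 come from the original post; the additive scoring rule, the two other trees, and all object names are assumptions made for illustration only.

```r
# Point values quoted in the original post.
points <- c(CV11 = 15, IN12 = 4)

# Tree 12607 is from the original post; hypo_a and hypo_b are hypothetical.
counts <- rbind(
  "12607"  = c(CV11 = 3, IN12 = 1),
  "hypo_a" = c(CV11 = 7, IN12 = 0),
  "hypo_b" = c(CV11 = 0, IN12 = 2)
)

# Assumed rule: the score is the sum of count times point value.
ecoval <- drop(counts %*% points)

# The group label is then a deterministic function of the counts.
group <- ifelse(ecoval > 100, "data100", "data0")
data.frame(counts, ecoval, group)
```

Under this assumed rule, knowing the counts fixes the group, so a difference in counts between the groups is built in rather than discovered.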

Looking for associations between different types of structures would be a 
different question that would not be based on point value at all but would use 
some measure of association/correlation.
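
As one possible illustration of such a measure, the sketch below computes Spearman rank correlations between three structure counts. The values are copied from the six head(data100) rows posted in this thread; the choice of columns and of Spearman's coefficient is an assumption, and six rows are far too few to conclude anything.

```r
# Counts for trees 1291, 1083, 3919, 14685, 4021, 5452 (head(data100)).
structures <- data.frame(
  BA11 = c(30, 0, 2, 11, 11, 0),
  EP12 = c(8, 8, 0, 1, 0, 0),
  IN22 = c(0, 0, 1, 0, 0, 1)
)

# Rank-based association between every pair of structure counts.
assoc <- cor(structures, method = "spearman")
round(assoc, 2)
```

On the full table the same call would take all structure columns at once and ignore the ecoval column entirely.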


David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Suzen, Mehmet
Sent: Tuesday, September 12, 2017 7:11 AM
To: Céline Lüscher 
Cc: r-help@r-project.org
Subject: Re: [R] comparition of occurrence of multiple variables between two 
dataframes

Hi Céline,
It looks like you are looking for a statistical test between two sets of
distributions, such as the Kolmogorov-Smirnov (KS) test: for example, generate
a histogram for each row in an identical way and run the KS test.
But if you are after a simple difference, you may use the compare package
(https://cran.r-project.org/web/packages/compare/index.html).
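
A rough sketch of what a two-sample KS test looks like in base R, using the BA11 counts from the head(data100) and head(data0) rows posted in this thread. Comparing one column between the groups is an assumption about what is meant here, and the vector names are hypothetical.

```r
# BA11 counts from the posted head() rows of each group.
ba11_high <- c(30, 0, 2, 11, 11, 0)  # trees 1291, 1083, 3919, 14685, 4021, 5452
ba11_low  <- c(0, 0, 0, 1, 3, 0)     # trees 4728, 5339, 11766, 796, 3561, 10581

# Count data contain ties, which may trigger a warning about the p-value.
res <- suppressWarnings(ks.test(ba11_high, ba11_low))
res$statistic
res$p.value
```

The KS test is designed for continuous distributions, so with sparse counts like these the result should be read with caution.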
Best,
-m
PS: Data is already plural :) datas does not exist.

On 12 September 2017 at 13:57, Céline Lüscher wrote:

> Yes of course, I can share this short view of the datas.
>
> Here is the head() of data100, containing all the trees with a final
> value higher than 100 :
>
> CV11
>
> CV12
>
> CV13
>
> CV14
>
> [[rest of the quoted table deleted: it continues with one cell per line]]

Re: [R] comparition of occurrence of multiple variables between two dataframes

2017-09-12 Thread Suzen, Mehmet
[[message truncated in the archive: the surviving fragment is a partial quoted
copy of the head(data100) and head(data0) tables from Céline Lüscher's message]]

Re: [R] comparition of occurrence of multiple variables between two dataframes

2017-09-12 Thread Suzen, Mehmet
Do you have a simplified example with code? It is not clear to me
what you mean by tree, but if you are referring to a tree data structure,
maybe you could convert the data to a tree
(https://cran.r-project.org/web/packages/data.tree/vignettes/data.tree.html)
and try to write a comparison of two tree objects. It might be easier
than with a data.frame alone.


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] comparition of occurrence of multiple variables between two dataframes

2017-09-12 Thread Céline Lüscher
Hi everyone, I need your help to solve a problem with occurrence and two
dataframes.
I have an Excel table of 15200 lines. Each line corresponds to a tree analyzed
for its structures. I have all the structures in columns (48 structures). The
occurrence of these structures has been counted on every tree. For example,
tree 12607 has 3 structures CV11, 1 structure IN12 and none (0) of the other
structures. The very last column is the value given to the tree, according to
the structures found on it (each structure giving a number of points to the
tree by its presence on it).
The question is: are there some structures, or combinations of structures,
which give a high value to the tree? Of course, from the value of each
structure, we can see which one has a higher value than the others (e.g.
structure CV11 has a value of 15, structure IN12 has a value of 4). But what I
want to know is: if we take all the trees with a final value higher than 100
(a new dataframe "data100") and compare them with the trees with a final value
under 100 (another dataframe "data0"), can we find a significant difference in
the number and occurrence of structures found on these trees? And which
structures are related to trees with a value higher than 100?
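
The split described above can be sketched as follows. The six rows and three columns are taken from the head() tables posted later in this thread; trees is a hypothetical name for the full table.

```r
# Six trees with two structure columns and the final value (ecoval).
trees <- data.frame(
  BA11   = c(30, 0, 2, 0, 1, 3),
  EP12   = c(8, 8, 0, 1, 0, 0),
  ecoval = c(1192, 424, 380, 99, 98, 98),
  row.names = c("1291", "1083", "3919", "5339", "796", "3561")
)

# Trees with a final value higher than 100, and trees under 100.
data100 <- subset(trees, ecoval > 100)
data0   <- subset(trees, ecoval < 100)

# Number of trees in each group on which each structure occurs at least once.
colSums(data100[, c("BA11", "EP12")] > 0)
colSums(data0[, c("BA11", "EP12")] > 0)
```

With strict inequalities on both sides, a tree with a value of exactly 100 would fall into neither group.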
For now, I have only a visual answer to the question. I made two boxplots, one
for data100 and one for data0, and I saw some differences: 2 structures are
found only in data100, which may be characteristic of a final value higher than
100. The problem is that I'm looking for a test to prove this.
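
The observation that some structures occur in only one group can be checked without a plot. A minimal sketch, using three rows per group and three columns from the head() tables posted later in this thread; the helper present() is hypothetical.

```r
# Rows 1291, 1083, 3919 (data100) and 5339, 796, 3561 (data0).
data100 <- data.frame(CV12 = c(0, 4, 0), BA11 = c(30, 0, 2), EP12 = c(8, 8, 0))
data0   <- data.frame(CV12 = c(0, 0, 0), BA11 = c(0, 1, 3),  EP12 = c(1, 0, 0))

# Names of the structures that occur on at least one tree of a group.
present <- function(d) names(d)[colSums(d > 0) > 0]

# Structures seen in data100 but never in data0.
only_in_data100 <- setdiff(present(data100), present(data0))
only_in_data100
```

In this small excerpt only CV12 is returned. This lists the structures but does not test anything.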
If you have any idea or suggestion for solving this problem, that would be great!
Best wishes,
C.




[[alternative HTML version deleted]]
