Re: [R] log transform a data frame

2023-06-13 Thread David Carlson via R-help
Try this:

pdf("~/graph.pdf")
par(mar = c(8, 4, 4, 2))  # enlarge the bottom margin so the rotated labels fit
barplot(d2, legend = c("SYCL", "CUDA"), beside = TRUE, las = 2,
        cex.axis = 0.7, cex.names = 0.7, ylim = c(0, 80),
        col = c("#9e9ac8", "#6a51a3"))
dev.off()

See ?par for details on adjusting margins and other plot features.
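
For anyone who wants to try the margin adjustment without the original CSV, here is a minimal sketch. The matrix `d2` below is a made-up three-column excerpt of the posted values, and the output path in `tempdir()` and the margin of 9 lines are assumptions chosen for illustration, not values from the thread.

```r
# Hypothetical stand-in for the poster's matrix: rows are SYCL and CUDA,
# columns carry long names that need room when drawn vertically (las = 2).
d2 <- matrix(c(65.6, 70.9, 29.7, 30.6, 7.3, 4.4), nrow = 2,
             dimnames = list(c("SYCL", "CUDA"),
                             c("PIE.mesh", "PIE.spread", "NB.X.F.buffer.ops.")))

out_file <- file.path(tempdir(), "graph.pdf")  # assumed output location

pdf(out_file, width = 7, height = 5)
# mar is c(bottom, left, top, right) in lines of text; the default bottom
# margin is about 5 lines, which is too small for long rotated labels.
op <- par(mar = c(9, 4, 2, 2))
barplot(d2, beside = TRUE, las = 2, cex.axis = 0.7, cex.names = 0.7,
        ylim = c(0, 80), col = c("#9e9ac8", "#6a51a3"),
        legend.text = rownames(d2))
par(op)
dev.off()
```

If labels are still clipped, increasing the first element of `mar` (or reducing `cex.names`) is the usual next step.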

David


On Tue, Jun 13, 2023 at 5:20 PM Ana Marija 
wrote:

> Thank you so much David, here is correction:
>
> d1=suppressWarnings(read.csv("/Users/anamaria/Downloads/B1.csv",
> stringsAsFactors=FALSE, header=TRUE))
> d1$X <- NULL
> d2=as.matrix(sapply(d1, as.numeric))
> pdf("~/graph.pdf")
> b<-barplot(d2, legend= c("SYCL", "CUDA"), beside=
> TRUE,las=2,cex.axis=0.7,cex.names=0.7,ylim=c(0,80), col=c("#9e9ac8",
> "#6a51a3"))
> dev.off()
>
>  > dput(head(d1))
> structure(list(Domain.decomp. = c("2. 1", "2"), DD.com..load = c(0L,
> 0L), Neighbor.search = c("3.7", "3. 1"), Launch.PP.GPU.ops. = c("0. 1",
> "0"), Comm..coord. = c("1 .6", "1 .0"), Force = c("1 . 5", "1 .2"
> ), Wait...Comm..F = c("1 .3", "1 .7"), PIE.mesh = c(65.6, 70.9
> ), Wait.Bonded.GPU = c(0L, 0L), wait.GPU.NB.nonloc. = c(0L, 0L
> ), Wait.GPU.NB.local = c(0L, 0L), NB.X.F.buffer.ops. = c(7.3,
> 4.4), Write.traje = c(0.3, 0.3), Update = c(6.3, 4.3), Constraints =
> c(8.9,
> 9.7), Comm..energies = c(0.9, 0.9), PIE.redist..X.F = c("8. 1",
> "8.7"), PIE.spread = c(29.7, 30.6), PIE.gather = c("19.9", "21 .3"
> ), PIE.3D.FFT = c(6, 8.6), PIE.3D.FFT.comm. = c("1 .2", "1 .0"
> ), PIE.solve.Elec = c(0.7, 0.5)), row.names = 1:2, class = "data.frame")
>
> Now my problem is that when I save my plot as PDF my labels on X axis are
> cut off. Any advice about that?

Re: [R] Problem with filling dataframe's column

2023-06-13 Thread avi.e.gross
Bert,
 
I stand corrected. What I said may once have been true, but the 
implementation has apparently changed at some level.
 
I did not factor that in.
 
Nevertheless, whether you use an index as a key or as an offset into an 
attached vector of labels, it seems to work the same, and I think my comment 
still applies: changing a few labels instead of scanning lots of 
entries can sometimes be a good thing. As far as I can tell, the external 
interface seems the same for now. 
 
One issue with R for a long time was that it did not offer something more like a 
Python dictionary and it looks like …
 
ABOVE
 
From: Bert Gunter  
Sent: Tuesday, June 13, 2023 6:15 PM
To: avi.e.gr...@gmail.com
Cc: javad bayat ; R-help@r-project.org
Subject: Re: [R] Problem with filling dataframe's column
 
Below.


On Tue, Jun 13, 2023 at 2:18 PM <avi.e.gr...@gmail.com> wrote:
>
>  
> Javad,
>
> There may be nothing wrong with the methods people are showing you and if it 
> satisfied you, great.
>
> But I note you have lots of data in over a quarter million rows. If much of 
> the text data is redundant, and you want to simplify some operations such as 
> changing some of the values to others in multiple ways, have you done any 
> learning about an R feature very useful for dealing with categorical data 
> called "factors"?
>
> If you have a vector or a column in a data.frame that contains text, then it 
> can be replaced by a factor that often takes way less space as it stores a 
> sort of dictionary of all the unique values and just records numbers like 
> 1,2,3 to tell which one each item is.
 
-- This is false. It used to be true a **long time ago**, but R has for quite a 
while used hashing/global string tables to avoid this problem. See here 

  for details/references.
As a result, I think many would argue that working with strings *as strings,* 
not factors, is often a better default, though of course there are still 
situations where factors are useful (e.g. in ordering results by factor levels 
where the desired level order is not alphabetical).
 
**I would appreciate correction/ clarification if my claims are wrong or 
misleading! **
 
In any case, please do check such claims before making them on this list.
 
Cheers,
Bert
 
 

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] log transform a data frame

2023-06-13 Thread Ana Marija
Thank you so much David, here is correction:

d1=suppressWarnings(read.csv("/Users/anamaria/Downloads/B1.csv",
stringsAsFactors=FALSE, header=TRUE))
d1$X <- NULL
d2=as.matrix(sapply(d1, as.numeric))
pdf("~/graph.pdf")
b<-barplot(d2, legend= c("SYCL", "CUDA"), beside=
TRUE,las=2,cex.axis=0.7,cex.names=0.7,ylim=c(0,80), col=c("#9e9ac8",
"#6a51a3"))
dev.off()

 > dput(head(d1))
structure(list(Domain.decomp. = c("2. 1", "2"), DD.com..load = c(0L,
0L), Neighbor.search = c("3.7", "3. 1"), Launch.PP.GPU.ops. = c("0. 1",
"0"), Comm..coord. = c("1 .6", "1 .0"), Force = c("1 . 5", "1 .2"
), Wait...Comm..F = c("1 .3", "1 .7"), PIE.mesh = c(65.6, 70.9
), Wait.Bonded.GPU = c(0L, 0L), wait.GPU.NB.nonloc. = c(0L, 0L
), Wait.GPU.NB.local = c(0L, 0L), NB.X.F.buffer.ops. = c(7.3,
4.4), Write.traje = c(0.3, 0.3), Update = c(6.3, 4.3), Constraints = c(8.9,
9.7), Comm..energies = c(0.9, 0.9), PIE.redist..X.F = c("8. 1",
"8.7"), PIE.spread = c(29.7, 30.6), PIE.gather = c("19.9", "21 .3"
), PIE.3D.FFT = c(6, 8.6), PIE.3D.FFT.comm. = c("1 .2", "1 .0"
), PIE.solve.Elec = c(0.7, 0.5)), row.names = 1:2, class = "data.frame")

Now my problem is that when I save my plot as a PDF, the labels on the X axis
are cut off. Any advice about that?



On Tue, Jun 13, 2023 at 5:14 PM David Carlson  wrote:

> Your first data column appears to contain character data (e.g. SYCL) which
> cannot be converted to numeric. You also appear to have 0's in the numeric
> columns which will cause problems since log(0) is -Inf. Barplots are useful
> for categorical data, but not for continuous numeric data, which are better
> handled with box plots or strip charts.
>
> Do not use printouts of your data since they hide important information.
> Use str(d11) and dput(d11) or dput(head(d11)) to provide useful information
> about your data.
>
> David L Carlson
> Texas A&M University

Re: [R] Problem with filling dataframe's column

2023-06-13 Thread Bert Gunter
Below.


On Tue, Jun 13, 2023 at 2:18 PM  wrote:
>
>
> Javad,
>
> There may be nothing wrong with the methods people are showing you and if
> it satisfied you, great.
>
> But I note you have lots of data in over a quarter million rows. If much
> of the text data is redundant, and you want to simplify some operations
> such as changing some of the values to others in multiple ways, have you
> done any learning about an R feature very useful for dealing with
> categorical data called "factors"?
>
> If you have a vector or a column in a data.frame that contains text, then
> it can be replaced by a factor that often takes way less space as it stores
> a sort of dictionary of all the unique values and just records numbers like
> 1,2,3 to tell which one each item is.

-- This is false. It used to be true a **long time ago**, but R has for
quite a while used hashing/global string tables to avoid this problem. See
here

for details/references.
As a result, I think many would argue that working with strings *as
strings,* not factors, is often a better default, though of course there
are still situations where factors are useful (e.g. in ordering results by
factor levels where the desired level order is not alphabetical).
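
One way to check the memory question on a given installation is to measure both representations directly. The snippet below is only a sketch: the vector `labels` and its length are made up for illustration, and the byte counts depend on platform and R version, so no particular result is claimed here.

```r
# Hypothetical data: 100,000 entries drawn from three distinct labels.
set.seed(1)
labels <- sample(c("Park", "Agri", "GS"), 1e5, replace = TRUE)

size_chr <- as.numeric(object.size(labels))          # kept as character
size_fct <- as.numeric(object.size(factor(labels)))  # converted to factor

# Show both sizes in bytes so they can be compared on this machine.
print(c(character = size_chr, factor = size_fct))
```

Note that object.size() reports an estimate, so it is best read as a rough comparison rather than an exact accounting.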

**I would appreciate correction/ clarification if my claims are wrong or
misleading! **

In any case, please do check such claims before making them on this list.

Cheers,
Bert


>
> You can access the values using levels(whatever) and also change them.
> There are packages that make this straightforward such as forcats which is
> one of the tidyverse packages that also includes many other tools some find
> useful but are beyond the usual scope of this mailing list.
>
> As an example, if you have a vector in mydata$col1 then code like:
>
> mydata$col1 <- factor(mydata$col1)
>
> No matter which way you do it, you can now access the levels and make
> whatever changes, and save the changes. One example could be to apply some
> variant of grep to make the substitution. There is a family of built-in
> functions such as sub() that matches a Regular Expression and replaces it
> with what you want.
>
> This has a similar result to changing all entries without doing all the
> work. I mean if item 5 used to be "OLD" and is now "NEW" then any of your
> quarter million entries that have a 5 will now be seen as having a value of
> "NEW".
>
> I will stop here and suggest you may want to read some book that explains
> R as a unified set of features with some emphasis on using it for the
> features it is intended to have that can make life easier, rather than
> using just features it shares with most languages. Some of your questions
> indicate you have less grounding and are mainly following recipes you
> stumble across.
>
> Otherwise, you will have a collection of what you call "codes" and others
> like me call programming and that don't necessarily fit well together.

Re: [R] log transform a data frame

2023-06-13 Thread David Carlson via R-help
Your first data column appears to contain character data (e.g. SYCL) which
cannot be converted to numeric. You also appear to have 0's in the numeric
columns which will cause problems since log(0) is -Inf. Barplots are useful
for categorical data, but not for continuous numeric data, which are better
handled with box plots or strip charts.

Do not use printouts of your data since they hide important information. Use
str(d11) and dput(d11) or dput(head(d11)) to provide useful information
about your data.
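
To make the two problems concrete, here is a minimal sketch of one way to get from the posted data to a log-transformed table. It assumes the stray spaces inside values such as "2. 1" are export artifacts and that "2. 1" means 2.1; the helper `to_number` and the four-column excerpt are hypothetical, and log1p() is just one possible way of coping with zeros.

```r
# Hypothetical excerpt mirroring a few columns of the posted dput() output.
d11 <- data.frame(
  X = c("SYCL", "CUDA"),
  Domain.decomp. = c("2. 1", "2"),
  DD.com..load = c(0L, 0L),
  PIE.mesh = c(65.6, 70.9),
  PIE.gather = c("19.9", "21 .3"),
  stringsAsFactors = FALSE
)

# Remove all whitespace before converting; anything that still cannot be
# parsed becomes NA (with a warning) instead of silently passing through.
to_number <- function(x) as.numeric(gsub("[[:space:]]+", "", as.character(x)))

# Drop the label column, convert the rest, and keep the labels as row names.
num <- as.data.frame(lapply(d11[-1], to_number))
rownames(num) <- d11$X

# log(0) is -Inf; log1p(x) computes log(1 + x), which maps 0 to 0.
logged <- log1p(num)
```

Whether log1p() is appropriate depends on what the numbers mean; replacing zeros with NA before calling log() is another common choice.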

David L Carlson
Texas A&M University


On Tue, Jun 13, 2023 at 4:08 PM Ana Marija 
wrote:

> Hello,
>
> I have a data frame like this:
>
> d11=suppressWarnings(read.csv("/Users/anamaria/Downloads/B1.csv",
> stringsAsFactors=FALSE, header=TRUE))
>
> > d11
>      X Domain.decomp. DD.com..load Neighbor.search Launch.PP.GPU.ops. Comm..coord.
> 1 SYCL           2. 1            0             3.7               0. 1         1 .6
> 2 CUDA              2            0            3. 1                  0         1 .0
>   Force Wait...Comm..F PIE.mesh Wait.Bonded.GPU wait.GPU.NB.nonloc.
> 1 1 . 5           1 .3     65.6               0                   0
> 2  1 .2           1 .7     70.9               0                   0
>   Wait.GPU.NB.local NB.X.F.buffer.ops. Write.traje Update Constraints Comm..energies
> 1                 0                7.3         0.3    6.3         8.9            0.9
> 2                 0                4.4         0.3    4.3         9.7            0.9
>   PIE.redist..X.F PIE.spread PIE.gather PIE.3D.FFT PIE.3D.FFT.comm. PIE.solve.Elec
> 1            8. 1       29.7       19.9        6.0             1 .2            0.7
> 2             8.7       30.6      21 .3        8.6             1 .0            0.5
>
> I am trying to log transform the whole data frame, but I get this error:
>
> > d1=log(d11)
> Error in Math.data.frame(d11) :
>   non-numeric variable(s) in data frame: X, Domain.decomp.,
> Neighbor.search, Launch.PP.GPU.ops., Comm..coord., Force, Wait...Comm..F,
> PIE.redist..X.F, PIE.gather, PIE.3D.FFT.comm
>
>
> My goal is to make a stacked barplot like this:
> d2=as.matrix(sapply(d1, as.numeric))
> b<-barplot(d2, legend= rownames(data2), beside=
> TRUE,las=2,cex.axis=0.7,cex.names=0.7,ylim=c(0,80), col=c("#9e9ac8",
> "#6a51a3"))
>
> If I don't log transform  my code runs.
>
> Please advise,
> Ana
>


Re: [R] Problem with filling dataframe's column

2023-06-13 Thread avi.e.gross
 
Javad,
 
There may be nothing wrong with the methods people are showing you and if it 
satisfied you, great.
 
But I note you have lots of data in over a quarter million rows. If much of the 
text data is redundant, and you want to simplify some operations such as 
changing some of the values to others in multiple ways, have you done any 
learning about an R feature very useful for dealing with categorical data 
called "factors"?
 
If you have a vector or a column in a data.frame that contains text, then it 
can be replaced by a factor that often takes way less space as it stores a sort 
of dictionary of all the unique values and just records numbers like 1,2,3 to 
tell which one each item is. 
 
You can access the values using levels(whatever) and also change them. There 
are packages that make this straightforward such as forcats which is one of the 
tidyverse packages that also includes many other tools some find useful but are 
beyond the usual scope of this mailing list.
 
As an example, if you have a vector in mydata$col1 then code like:
 
mydata$col1 <- factor(mydata$col1)
 
No matter which way you do it, you can now access the levels and make whatever 
changes, and save the changes. One example could be to apply some variant of 
grep to make the substitution. There is a family of built-in functions such as 
sub() that matches a Regular Expression and replaces it with what you want.
 
This has a similar result to changing all entries without doing all the work. I 
mean if item 5 used to be "OLD" and is now "NEW" then any of your quarter 
million entries that have a 5 will now be seen as having a value of "NEW".
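
As a small illustration of the level-renaming idea, here is a sketch on a made-up vector (`layer` and its labels are hypothetical, not taken from the poster's data):

```r
# Hypothetical column with repeated labels.
layer <- factor(c("OLD", "Level 2", "OLD", "Level 3", "OLD"))

# Renaming one level changes how every element carrying that level is shown.
levels(layer)[levels(layer) == "OLD"] <- "NEW"

# A regular expression can rewrite several levels in one step.
levels(layer) <- sub("^Level ", "L", levels(layer))

as.character(layer)
# "NEW" "L2" "NEW" "L3" "NEW"
```

Only the short vector of levels is edited; the integer codes behind the factor are left untouched.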
 
I will stop here and suggest you may want to read a book that explains R as 
a unified set of features, with some emphasis on using the features it is 
intended to have that can make life easier, rather than just the features it 
shares with most languages. Some of your questions indicate you have less 
grounding and are mainly following recipes you stumble across. 
 
Otherwise, you will end up with a collection of what you call "codes" and 
others like me call programming, pieces that don't necessarily fit well together.
 
 
-Original Message-
From: R-help r-help-boun...@r-project.org  
 On Behalf Of javad bayat
Sent: Tuesday, June 13, 2023 3:47 PM
To: Eric Berger ericjber...@gmail.com  
Cc: R-help@r-project.org  
Subject: Re: [R] Problem with filling dataframe's column
 
Dear all;
I used these codes and I get what I wanted.
Sincerely
 
pat = c("Level 12","Level 22","0")
data3 = data2[-which(data2$Layer == pat),]
dim(data2)
[1] 281549  9
dim(data3)
[1] 244075  9
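
A note of caution on the comparison above, offered as a sketch rather than a diagnosis of the real data: when `pat` has three elements, `data2$Layer == pat` recycles `pat` along the column, so each row is compared with only one of the three values. The miniature `data2` below is made up to show the difference from `%in%`, which tests each row against all of them.

```r
# Hypothetical miniature of data2; only the Layer column matters here.
data2 <- data.frame(
  Layer = c("Level 12", "Level 12", "Level 22", "0", "Level 5", "0"),
  value = 1:6,
  stringsAsFactors = FALSE
)
pat <- c("Level 12", "Level 22", "0")

# `==` recycles pat: row 1 is compared with pat[1], row 2 with pat[2],
# row 3 with pat[3], row 4 with pat[1] again, and so on.
hits_eq <- which(data2$Layer == pat)

# %in% asks, for every row, whether Layer equals any element of pat.
hits_in <- which(data2$Layer %in% pat)

# Logical indexing also stays safe when nothing matches.
data3 <- data2[!(data2$Layer %in% pat), ]
```

Here `hits_eq` is rows 1 and 6 while `hits_in` is rows 1, 2, 3, 4 and 6, so the `==` version leaves unwanted rows behind.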
 
On Tue, Jun 13, 2023 at 11:36 AM Eric Berger <ericjber...@gmail.com> wrote:
 
> Hi Javed,
> grep returns the positions of the matches. See an example below.
> 
> > v <- c("abc", "bcd", "def")
> > v
> [1] "abc" "bcd" "def"
> > grep("cd",v)
> [1] 2
> > w <- v[-grep("cd",v)]
> > w
> [1] "abc" "def"
> >
> 
> 
> On Tue, Jun 13, 2023 at 8:50 AM javad bayat <j.bayat...@gmail.com> wrote:
> >
> > Dear Rui;
> > Hi. I used your codes, but it seems it didn't work for me.
> >
> > > pat <- c("_esmdes|_Des Section|0")
> > > dim(data2)
> > [1]  281549  9
> > > grep(pat, data2$Layer)
> > > dim(data2)
> > [1]  281549  9
> >
> > What does the grep function do? I expected the function to remove 3 rows of
> > the dataframe.
> > I do not know the reason.
> >
> >
> >
> >
> >
> >
> > On Mon, Jun 12, 2023 at 5:16 PM Rui Barradas <ruipbarra...@sapo.pt> wrote:
> >
> > > At 23:13 on 12/06/2023, javad bayat wrote:
> > > > Dear Rui;
> > > > Many thanks for the email. I tried your codes and found that the length of
> > > > the "Values" and "Names" vectors must be equal, otherwise the results will
> > > > not be useful.
> > > > For some of the characters in the Layer column that I do not need to be
> > > > filled in the LU column, I used "NA".
> > > > But I need to delete some of the rows from the table as they are useless
> > > > for me. I tried this code to delete entire rows of the dataframe which
> > > > contained these three values in the Layer column. It gave me the following
> > > > error.
> > > >
> > > > > data3 = data2[-grep(c("_esmdes","_Des Section","0"), data2$Layer),]
> > > > Warning message:
> > > > In grep(c("_esmdes", "_Des Section", "0"), data2$Layer) :
> > > >   argument 'pattern' has length > 1 and only the first element will be used
> > > >
> > > > > data3 = data2[!grepl(c("_esmdes","_Des Section","0"), data2$Layer),]
> > > > Warning message:
> > > > In grepl(c("_esmdes", "_Des Section", "0"), data2$Layer) :
> > > >   argument 'pattern' has length > 1 and only the first element will be used
> > > >
> > > > How can I do this?
> > > > 

[R] log transform a data frame

2023-06-13 Thread Ana Marija
Hello,

I have a data frame like this:

d11=suppressWarnings(read.csv("/Users/anamaria/Downloads/B1.csv",
stringsAsFactors=FALSE, header=TRUE))

> d11
     X Domain.decomp. DD.com..load Neighbor.search Launch.PP.GPU.ops. Comm..coord.
1 SYCL           2. 1            0             3.7               0. 1         1 .6
2 CUDA              2            0            3. 1                  0         1 .0
  Force Wait...Comm..F PIE.mesh Wait.Bonded.GPU wait.GPU.NB.nonloc.
1 1 . 5           1 .3     65.6               0                   0
2  1 .2           1 .7     70.9               0                   0
  Wait.GPU.NB.local NB.X.F.buffer.ops. Write.traje Update Constraints Comm..energies
1                 0                7.3         0.3    6.3         8.9            0.9
2                 0                4.4         0.3    4.3         9.7            0.9
  PIE.redist..X.F PIE.spread PIE.gather PIE.3D.FFT PIE.3D.FFT.comm. PIE.solve.Elec
1            8. 1       29.7       19.9        6.0             1 .2            0.7
2             8.7       30.6      21 .3        8.6             1 .0            0.5

I am trying to log transform the whole data frame, but I get this error:

> d1=log(d11)
Error in Math.data.frame(d11) :
  non-numeric variable(s) in data frame: X, Domain.decomp.,
Neighbor.search, Launch.PP.GPU.ops., Comm..coord., Force, Wait...Comm..F,
PIE.redist..X.F, PIE.gather, PIE.3D.FFT.comm


My goal is to make a stacked barplot like this:
d2=as.matrix(sapply(d1, as.numeric))
b<-barplot(d2, legend= rownames(data2), beside=
TRUE,las=2,cex.axis=0.7,cex.names=0.7,ylim=c(0,80), col=c("#9e9ac8",
"#6a51a3"))

If I don't log transform  my code runs.

Please advise,
Ana



Re: [R] Problem with filling dataframe's column

2023-06-13 Thread Bill Dunlap
It is safer to use !grepl(...) instead of -grep(...) here.  If there are no
matches, the latter will give you a zero-row data.frame while the former
gives you the entire data.frame.

E.g.,

> d <- data.frame(a=c("one","two","three"), b=c(10,20,30))
> d[-grep("Q", d$a),]
[1] a b
<0 rows> (or 0-length row.names)
> d[!grepl("Q", d$a),]
  a  b
1   one 10
2   two 20
3 three 30
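
The same !grepl() idiom extends to several patterns at once. The sketch below uses a made-up `data2` and assumes that "0" is meant as an exact label rather than "any value containing a zero"; both the sample labels and that reading are assumptions for illustration.

```r
# Hypothetical miniature of data2.
data2 <- data.frame(
  Layer = c("a_esmdes", "b_Des Section", "0", "Level 10", "Level 3"),
  stringsAsFactors = FALSE
)

# grep()/grepl() use only the first element of a pattern vector, so the
# alternatives are joined into one regular expression with "|".
# Anchoring the zero as ^0$ keeps it from matching "Level 10".
pattern <- paste(c("_esmdes", "_Des Section", "^0$"), collapse = "|")

is_unwanted <- grepl(pattern, data2$Layer)
data3 <- data2[!is_unwanted, , drop = FALSE]
```

With these inputs, `data3` keeps only "Level 10" and "Level 3"; an unanchored "0" would have removed "Level 10" as well.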

-Bill

On Tue, Jun 13, 2023 at 6:19 AM Rui Barradas  wrote:

> At 17:18 on 13/06/2023, javad bayat wrote:
> > Dear Rui;
> > Hi. I used your codes, but it seems it didn't work for me.
> >
> >> pat <- c("_esmdes|_Des Section|0")
> >> dim(data2)
> >  [1]  281549  9
> >> grep(pat, data2$Layer)
> >> dim(data2)
> >  [1]  281549  9
> >
> > What does the grep function do? I expected the function to remove 3 rows of
> > the dataframe.
> > I do not know the reason.
> >
> > On Mon, Jun 12, 2023 at 5:16 PM Rui Barradas wrote:
> >
> >> At 23:13 on 12/06/2023, javad bayat wrote:
> >>> Dear Rui;
> >>> Many thanks for the email. I tried your codes and found that the length of
> >>> the "Values" and "Names" vectors must be equal, otherwise the results will
> >>> not be useful.
> >>> For some of the characters in the Layer column that I do not need to be
> >>> filled in the LU column, I used "NA".
> >>> But I need to delete some of the rows from the table as they are useless
> >>> for me. I tried this code to delete entire rows of the dataframe which
> >>> contained these three values in the Layer column. It gave me the following
> >>> error.
> >>>
> >>>> data3 = data2[-grep(c("_esmdes","_Des Section","0"), data2$Layer),]
> >>> Warning message:
> >>> In grep(c("_esmdes", "_Des Section", "0"), data2$Layer) :
> >>>   argument 'pattern' has length > 1 and only the first element will be used
> >>>
> >>>> data3 = data2[!grepl(c("_esmdes","_Des Section","0"), data2$Layer),]
> >>> Warning message:
> >>> In grepl(c("_esmdes", "_Des Section", "0"), data2$Layer) :
> >>>   argument 'pattern' has length > 1 and only the first element will be used
> >>>
> >>> How can I do this?
> >>> Sincerely
> >>>
> >>> On Sun, Jun 11, 2023 at 5:03 PM Rui Barradas wrote:
> >>>
> >>>> At 13:18 on 11/06/2023, Rui Barradas wrote:
> >>>>> At 22:54 on 11/06/2023, javad bayat wrote:
> >>>>>> Dear Rui;
> >>>>>> Many thanks for your email. I used one of your codes,
> >>>>>> "data2$LU[which(data2$Layer == "Level 12")] <- "Park"", and it works
> >>>>>> correctly for me.
> >>>>>> Actually I need to expand the codes so as to consider all "Levels" in the
> >>>>>> "Layer" column. There are more than a hundred levels in the Layer column.
> >>>>>> If I use your provided code, I have to write it hundreds of times as below:
> >>>>>> data2$LU[which(data2$Layer == "Level 1")] <- "Park";
> >>>>>> data2$LU[which(data2$Layer == "Level 2")] <- "Agri";
> >>>>>> ...
> >>>>>> ...
> >>>>>> ...
> >>>>>> .
> >>>>>> Is there any other way to expand the code in order to consider all of the
> >>>>>> levels simultaneously? Like the below code:
> >>>>>> data2$LU[which(data2$Layer == c("Level 1","Level 2", "Level 3", ...))] <-
> >>>>>> c("Park", "Agri", "GS", ...)
> >>>>>>
> >>>>>>
> >>>>>> Sincerely
> >>>>>>
> >>>>>> On Sun, Jun 11, 2023 at 1:43 PM Rui Barradas wrote:
> >>>>>>
> >>>>>>> At 21:05 on 11/06/2023, javad bayat wrote:
> >>>>>>>> Dear R users;
> >>>>>>>> I am trying to fill a column based on a specific value in another column of
> >>>>>>>> a dataframe, but it seems there is a problem with the codes!
> >>>>>>>> The "Layer" and the "LU" are two different columns of the dataframe.
> >>>>>>>> How can I fix this?
> >>>>>>>> Sincerely
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> for (i in 1:nrow(data2$Layer)){
> >>>>>>>>   if (data2$Layer == "Level 12") {
> >>>>>>>>   data2$LU == "Park"
> >>>>>>>>   }
> >>>>>>>>   }
> >>>>>>>>
> >>>>>>>>
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> There are two bugs in your code,
> >>>>>>>
> >>>>>>> 1) the index i is not used in the loop
> >>>>>>> 2) the assignment operator is `<-`, not `==`
> >>>>>>>
> >>>>>>>
> >>>>>>> Here is the loop corrected.
> >>>>>>>
> >>>>>>> for (i in seq_len(nrow(data2))){  # nrow() takes the data frame, not a column
> >>>>>>>   if (data2$Layer[i] == "Level 12") {
> >>>>>>>     data2$LU[i] <- "Park"
> >>>>>>>   }
> >>>>>>> }
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> But R is a vectorized language; the following two ways are the idiomatic
> >>>>>>> ways of doing what you want to do.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> i <- data2$Layer == "Level 12"
> >>>>>>> data2$LU[i] <- "Park"
> >>>>>>>
> >>>>>>> # equivalent one-liner
> >>>>>>> data2$LU[data2$Layer == "Level 12"] <- "Park"
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> If there are NA's in 

Re: [R] Rmarkdown code rendering as LaTeX, not executing?

2023-06-13 Thread Olivier Crouzet
Dear Kevin,

actually you're mixing markdown and LaTeX syntax, which is the reason
why you see LaTeX code in the PDF. You have to choose...

1) Either you wish to produce an RMarkdown document and your sections,
subsections... should read:

# Abstract

In this document, ...

## Boundaries of the Radnor-Winston neighborhood

```{r rw_map, fig.width = 6, fig.height = 4, out.width = "80%", dev = "pdf", fig.cap = "Map of RW neighborhood\\label{RWneigh}"}

## Creating a polygon for RW neighborhood, based on CRS 6487 (NAD83
## (2011) / Maryland ) map in meters:
base_x <- 433000
base_y <- 186000

[etc]...
```
But you may not be able to use the full power of LaTeX (at least not easily).


2) Or you wish to produce a .Rnw (knitr / Sweave file using LaTeX) and
you should use another R code delimitation convention (and change the
heading part of the document to LaTeX usage, that is
\documentclass{}... \usepackage{}...):

<<rw_map, fig.width = 6, fig.height = 4>>=

## Creating a polygon for RW neighborhood, based on CRS 6487 (NAD83
## (2011) / Maryland ) map in meters:
base_x <- 433000
base_y <- 186000

[etc]

@

Depending on your choice, compiling the document goes through a
different process but both are possible and relatively simple (either
from within RStudio or using any other editor).
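
As a rough sketch of the two compilation routes, the helper below picks the rendering call from the file extension. "RW_test.Rmd" is the name used in the thread; "RW_test.Rnw" and the helper names render_route() and render_doc() are only illustrative assumptions. rmarkdown::render() and knitr::knit2pdf() are the usual entry points for the two formats.

```r
# Decide which tool would compile a document, based on its extension.
# render_route() and render_doc() are hypothetical helper names.
render_route <- function(path) {
  switch(tolower(tools::file_ext(path)),
         rmd = "rmarkdown::render",
         rnw = "knitr::knit2pdf",
         stop("unsupported extension: ", path))
}

# Compile the document with the matching tool (needs the file to exist
# and the rmarkdown / knitr packages to be installed).
render_doc <- function(path) {
  switch(render_route(path),
         "rmarkdown::render" = rmarkdown::render(path),
         "knitr::knit2pdf"   = knitr::knit2pdf(path))
}

render_route("RW_test.Rmd")
render_route("RW_test.Rnw")  # hypothetical file name
```

Either way, the chunk header has to sit on a single line for knitr to recognise it as a code chunk.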

Hope this helps for a first approach.

Olivier.





On Tue, 13 Jun 2023 16:29:57 + Kevin
Zembower via R-help  wrote:

> Hi, all,
> 
> I'm trying to compose an Rmarkdown document and render it as a PDF
> file. My first block of R code seems to work okay, but the second one
> seems to be interpreted as LaTeX code, and not executed as R code. In
> the output, the three back-ticks that mark the R code block are
> interpreted as an opening double-quote, followed by an opening single
> quote.
> 
> Here's my test file:
> 
> ---
> title: "An analysis of US 2020 Census Data for the Radnor-Winston neighborhood"
> author: "E. Kevin Zembower"
> date: "29 May 2023"
> output:
>   pdf_document:
>     extra_dependencies: ["array", "booktabs", "dcolumn"]
> 
> ---
> 
> ```{r setup, include = FALSE}
> 
> ```
> 
> \section{Abstract}
> In this document, ...
> 
> \section{Boundaries of the Radnor-Winston neighborhood}
> 
> ...
> 
>   For the purposes of this report, the
> boundaries of RW are as shown in figure \ref{RWneigh}. ...
> 
> ```{r rw_map,  fig.width = 6, fig.height = 4, out.width = "80%", dev
> = "pdf",
> fig.cap = "Map of RW neighborhood\label{RWneigh}"}
> 
> ## Creating a polygon for RW neighborhood, based on CRS 6487 (NAD83
> ## (2011) / Maryland ) map in meters:
> base_x <- 433000
> base_y <- 186000
> rw_neigh_pg_m <- data.frame(
>  matrix(
>  c(540, 1140,
>    540, 1070,
>    480, 1060,
>    490, 1000,
>    570, 1000,
>    570, 940,
>    550, 930,
>    550, 890,
>    580, 890,
>    590, 820,
>    640, 820,
>    650, 590,
>    520, 580,
>    470, 580,
>    350, 660,
>    350, 710,
>    180, 725,
>    190, 900,
>    220, 900,
>    220, 1030,
>    240, 1030,
>    240, 1110
>  ),
>  ncol = 2, byrow = TRUE)
> ) %>% + matrix(c(rep(base_x, nrow(.)), rep(base_y, nrow(.))),
> nrow = nrow(.)) %>%
> sf::st_as_sf(coords = c(1,2), dim = "XY") %>%
> summarize(geometry = st_combine(geometry)) %>%
> st_cast("POLYGON") %>%
> st_set_crs(6487)
> 
> ## Map it:
> rw_base_blocks <- read_osm(bb(rw_neigh_pg_m, ext = 1.3))
> 
> ## Line below gives map in meters
> (RW_block_map <- tm_shape(rw_base_blocks, projection = 6487) +
> ## Line below gives map in degrees
> ## (RW_block_map <- tm_shape(rw_base_blocks, projection = 6487) +
>   tm_rgb() +
>   tm_shape(rw_neigh_pg_m) +
>   tm_fill(col = "green", alpha = 0.2) +
>   tm_borders(lwd = 2, alpha = 1) +
>   tm_scale_bar() +
>   ## tm_grid() + tm_xlab("Long") + tm_ylab("Lat") +
>   tm_grid() +
>   tm_layout(title = "Radnor-Winston Neighborhood")
> )
> 
> ## tmap_save(RW_block_map, "rw_map.png")
> 
> ```
> 
> 
> This code block can also be obtained from 
> https://gist.github.com/kzembower/f9ad52abf82975102cbf715bcfbc0f51.
> 
> I'm using Emacs and ESS to create this document. This seems to
> produce its own weirdness, as the text style and font color and sizes
> change in the R code block as I edit it and add spaces and lines.
> 
> If the block above is saved as "RW_test.Rmd", I use these lines to 
> create the PDF:
> ===
> library(rmarkdown)
> render("RW_test.Rmd")
> 
> 
> No errors are generated.
> 
> Can anyone help me understand what I'm doing wrong? A much shorter
> test file I created seems to work okay.
> 
> Thanks in advance for any advice.
> 
> -Kevin
> 
>  > sessionInfo()
> R version 4.3.0 (2023-04-21)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 

[R] Rmarkdown code rendering as LaTeX, not executing?

2023-06-13 Thread Kevin Zembower via R-help
Hi, all,

I'm trying to compose an Rmarkdown document and render it as a PDF file. 
My first block of R code seems to work okay, but the second one seems to 
be interpreted as LaTeX code, and not executed as R code. In the output, 
the three back-ticks that mark the R code block are interpreted as an 
opening double-quote, followed by an opening single quote.

Here's my test file:

---
title: "An analysis of US 2020 Census Data for the Radnor-Winston neighborhood"
author: "E. Kevin Zembower"
date: "29 May 2023"
output:
  pdf_document:
    extra_dependencies: ["array", "booktabs", "dcolumn"]

---

```{r setup, include = FALSE}

```

\section{Abstract}
In this document, ...

\section{Boundaries of the Radnor-Winston neighborhood}

...

  For the purposes of this report, the
boundaries of RW are as shown in figure \ref{RWneigh}. ...

```{r rw_map,  fig.width = 6, fig.height = 4, out.width = "80%", dev = 
"pdf",
fig.cap = "Map of RW neighborhood\label{RWneigh}"}

## Creating a polygon for RW neighborhood, based on CRS 6487 (NAD83
## (2011) / Maryland ) map in meters:
base_x <- 433000
base_y <- 186000
rw_neigh_pg_m <- data.frame(
 matrix(
 c(540, 1140,
   540, 1070,
   480, 1060,
   490, 1000,
   570, 1000,
   570, 940,
   550, 930,
   550, 890,
   580, 890,
   590, 820,
   640, 820,
   650, 590,
   520, 580,
   470, 580,
   350, 660,
   350, 710,
   180, 725,
   190, 900,
   220, 900,
   220, 1030,
   240, 1030,
   240, 1110
 ),
 ncol = 2, byrow = TRUE)
) %>% + matrix(c(rep(base_x, nrow(.)), rep(base_y, nrow(.))),
nrow = nrow(.)) %>%
sf::st_as_sf(coords = c(1,2), dim = "XY") %>%
summarize(geometry = st_combine(geometry)) %>%
st_cast("POLYGON") %>%
st_set_crs(6487)

## Map it:
rw_base_blocks <- read_osm(bb(rw_neigh_pg_m, ext = 1.3))

## Line below gives map in meters
(RW_block_map <- tm_shape(rw_base_blocks, projection = 6487) +
## Line below gives map in degrees
## (RW_block_map <- tm_shape(rw_base_blocks, projection = 6487) +
  tm_rgb() +
  tm_shape(rw_neigh_pg_m) +
  tm_fill(col = "green", alpha = 0.2) +
  tm_borders(lwd = 2, alpha = 1) +
  tm_scale_bar() +
  ## tm_grid() + tm_xlab("Long") + tm_ylab("Lat") +
  tm_grid() +
  tm_layout(title = "Radnor-Winston Neighborhood")
)

## tmap_save(RW_block_map, "rw_map.png")

```


This code block can also be obtained from 
https://gist.github.com/kzembower/f9ad52abf82975102cbf715bcfbc0f51.

I'm using Emacs and ESS to create this document. This seems to produce 
its own weirdness, as the text style and font color and sizes change in 
the R code block as I edit it and add spaces and lines.

If the block above is saved as "RW_test.Rmd", I use these lines to 
create the PDF:
===
library(rmarkdown)
render("RW_test.Rmd")


No errors are generated.

Can anyone help me understand what I'm doing wrong? A much shorter test 
file I created seems to work okay.

Thanks in advance for any advice.

-Kevin

 > sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_US.UTF-8          LC_NUMERIC=C                  LC_TIME=en_US.UTF-8
 [4] LC_COLLATE=en_US.UTF-8        LC_MONETARY=en_US.UTF-8       LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8          LC_NAME=en_US.UTF-8           LC_ADDRESS=en_US.UTF-8
[10] LC_TELEPHONE=en_US.UTF-8      LC_MEASUREMENT=en_US.UTF-8    LC_IDENTIFICATION=en_US.UTF-8

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] kableExtra_1.3.4 tidycensus_1.4   lubridate_1.9.2  forcats_1.0.0    stringr_1.5.0    dplyr_1.1.2
 [7] purrr_1.0.1      readr_2.1.4      tidyr_1.3.0      tibble_3.2.1     ggplot2_3.4.2    tidyverse_2.0.0
[13] rmarkdown_2.22

loaded via a namespace (and not attached):
 [1] gtable_0.3.3       xfun_0.39          raster_3.6-20      tigris_2.0.3       rJava_1.0-6
 [6] lattice_0.21-8     tzdb_0.4.0         vctrs_0.6.2        tools_4.3.0        generics_0.1.3
[11] curl_5.0.0         proxy_0.4-27       fansi_1.0.4        pkgconfig_2.0.3    KernSmooth_2.23-21
[16] webshot_0.5.4      uuid_1.1-0         lifecycle_1.0.3    compiler_4.3.0     munsell_0.5.0
[21] tinytex_0.45       terra_1.7-29       codetools_0.2-19   htmltools_0.5.5    class_7.3-22
[26] yaml_2.3.7         crayon_1.5.2       pillar_1.9.0       classInt_0.4-9     tidyselect_1.2.0
[31] rvest_1.0.3        digest_0.6.31   

[R-es] importing a txt file with decimal and thousands separators

2023-06-13 Thread Sebastian Kruk
Dear R users,

Good morning.

I have a text file in which the first row contains the column names
and the first column contains the row names.

All the numbers use the comma as the decimal separator and the period
as the thousands separator.

The first five rows of the file look like this when opened with
Notepad on Windows:

Estacion "Mes 1" "Mes 2" "Mes 3" "Mes 4" "Mes 5" "Mes 6" "Mes 7" "Mes 8" "Mes 9" "Mes 10" "Mes 11" "Mes 12"
"ES 1" 242,142 251,515 296,482 252,345 241,439 269,308 295,04 275,97 279,858 291,124 296,004 319,853
"ES 2" 19,884 32,892 41,969 38,997 43,0 27,151 35,369 27,292 37,133 40,073 39,815 43,023
"ES 3" 0,1 0,1 0,1 0,1 0,1 0,1 0,1 0,1 0,1 0,1 0,1 0,108
"ES 4" 1.266,116 1.203,418 1.405,572 1.280,979 1.304,583 1.478,137 1.353,412 1.276,197 1.277,332 1.468,338 1.332,849 1.440,237

What would be the best way to import them so that they end up
converted to a numeric matrix?

Regards,

Sebastián.

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es
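
One possible approach to the question above, as a sketch only: read every field as text, strip the thousands separator, turn the decimal comma into a point, and then convert. The sample below is a shortened, made-up excerpt in the same layout (three columns instead of twelve); with a real file one would pass its path to read.table() instead of text =.

```r
# Shortened stand-in for the file described in the post (assumed layout).
txt <- c(
  'Estacion "Mes 1" "Mes 2" "Mes 3"',
  '"ES 1" 242,142 251,515 296,482',
  '"ES 3" 0,1 0,1 0,108',
  '"ES 4" 1.266,116 1.203,418 1.405,572'
)

# Read everything as character so "1.266,116" is not mangled on input.
raw <- read.table(text = txt, header = TRUE, row.names = 1,
                  colClasses = "character", check.names = FALSE)

m <- as.matrix(raw)                      # character matrix, dimnames kept
m[] <- gsub(".", "", m, fixed = TRUE)    # drop the thousands separator
m[] <- gsub(",", ".", m, fixed = TRUE)   # decimal comma -> decimal point
storage.mode(m) <- "double"              # now a numeric matrix

m
```

read.table() has a dec = "," argument, but it does not handle a thousands separator, which is why the text is cleaned by hand here.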


Re: [R] Problem with filling dataframe's column

2023-06-13 Thread Rui Barradas

On 13/06/2023 at 17:18, javad bayat wrote:

Dear Rui;
Hi. I used your codes, but it seems it didn't work for me.


pat <- c("_esmdes|_Des Section|0")
dim(data2)

 [1]  281549  9

grep(pat, data2$Layer)
dim(data2)

 [1]  281549  9

What does grep function do? I expected the function to remove 3 rows of the
dataframe.
I do not know the reason.






On Mon, Jun 12, 2023 at 5:16 PM Rui Barradas  wrote:


On 12/06/2023 at 23:13, javad bayat wrote:

Dear Rui;
Many thanks for the email. I tried your code and found that the length of
the "Values" and "Names" vectors must be equal, otherwise the results will
not be useful.
For some of the values in the Layer column that I do not need to be
filled in the LU column, I used "NA".
But I need to delete some of the rows from the table as they are useless
for me. I tried this code to delete entire rows of the dataframe which
contain these three values in the Layer column. It gave me the following
warning.


data3 = data2[-grep(c("_esmdes","_Des Section","0"), data2$Layer),]

Warning message:
In grep(c("_esmdes", "_Des Section", "0"), data2$Layer) :
  argument 'pattern' has length > 1 and only the first element will be used


data3 = data2[!grepl(c("_esmdes","_Des Section","0"), data2$Layer),]

Warning message:
In grepl(c("_esmdes", "_Des Section", "0"), data2$Layer) :
  argument 'pattern' has length > 1 and only the first element will be used

How can I do this?
Sincerely










On Sun, Jun 11, 2023 at 5:03 PM Rui Barradas 

wrote:



On 11/06/2023 at 13:18, Rui Barradas wrote:

On 11/06/2023 at 22:54, javad bayat wrote:

Dear Rui;
Many thanks for your email. I used one of your codes,
"data2$LU[which(data2$Layer == "Level 12")] <- "Park"", and it works
correctly for me.
Actually I need to expand the code so as to consider all "Levels" in the
"Layer" column. There are more than a hundred levels in the Layer column.
If I use your provided code, I have to write it a hundred times, as below:

data2$LU[which(data2$Layer == "Level 1")] <- "Park";
data2$LU[which(data2$Layer == "Level 2")] <- "Agri";
...
...
...
.
Is there any other way to expand the code in order to consider all of the
levels simultaneously? Like the code below:
data2$LU[which(data2$Layer == c("Level 1","Level 2", "Level 3", ...))] <-
c("Park", "Agri", "GS", ...)


Sincerely




On Sun, Jun 11, 2023 at 1:43 PM Rui Barradas 
wrote:


On 11/06/2023 at 21:05, javad bayat wrote:

Dear R users;
I am trying to fill a column based on a specific value in another column of
a dataframe, but it seems there is a problem with the code!
The "Layer" and the "LU" are two different columns of the dataframe.
How can I fix this?
Sincerely


for (i in 1:nrow(data2$Layer)){
  if (data2$Layer == "Level 12") {
  data2$LU == "Park"
  }
  }





Hello,

There are three bugs in your code,

1) the index i is not used in the loop
2) the assignment operator is `<-`, not `==`
3) nrow() of a single column is NULL; loop over the rows of the data
   frame instead


Here is the loop corrected.

for (i in seq_len(nrow(data2))) {
  if (data2$Layer[i] == "Level 12") {
    data2$LU[i] <- "Park"
  }
}



But R is a vectorized language; the following two ways are the idiomatic
ways of doing what you want to do.



i <- data2$Layer == "Level 12"
data2$LU[i] <- "Park"

# equivalent one-liner
data2$LU[data2$Layer == "Level 12"] <- "Park"



If there are NAs in data2$Layer it's probably safer to wrap the logical
index in which() (see ?which), to have a numeric one.



i <- which(data2$Layer == "Level 12")
data2$LU[i] <- "Park"

# equivalent one-liner
data2$LU[which(data2$Layer == "Level 12")] <- "Park"


Hope this helps,

Rui Barradas
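
To illustrate the point about NAs, here is a small sketch on a made-up three-row data2 (the values are assumptions, only the column names come from the thread). A logical index that contains NA produces an extra all-NA row when used to extract rows, while the which() version returns only the true matches.

```r
# Toy stand-in for data2; the real data has 281549 rows.
data2 <- data.frame(Layer = c("Level 12", NA, "Level 3"),
                    LU = NA_character_,
                    stringsAsFactors = FALSE)

hit <- data2$Layer == "Level 12"      # TRUE NA FALSE

rows_logical <- data2[hit, ]          # the match plus a row made of NAs
rows_which   <- data2[which(hit), ]   # only the matching row

# Assignment through the numeric index touches only real matches.
data2$LU[which(hit)] <- "Park"
data2
```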





Hello,

You don't need to repeat the same instruction 100+ times; there is a way
of assigning all new LU values at the same time with match().
This assumes that you have the new values in a vector.


Sorry, this is not clear. I mean


This assumes that you have the new values in a vector, the vector Names
below. The vector of values to be matched is created from the data.


Rui Barradas




Values <- sort(unique(data2$Layer))
Names <- c("Park", "Agri", "GS")

i <- match(data2$Layer, Values)
data2$LU <- Names[i]


Hope this helps,

Rui Barradas
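
A variation on the match() idea, sketched with made-up values: keeping the Layer names and the LU names together in one named vector avoids relying on the order of sort(unique(...)), which sorts as text ("Level 10" comes before "Level 2"). The name lookup and its contents are assumptions for illustration.

```r
# Toy data; "0" stands for a Layer value with no land-use label.
data2 <- data.frame(Layer = c("Level 2", "Level 1", "Level 3", "Level 1", "0"),
                    stringsAsFactors = FALSE)

# Hypothetical lookup table: names are Layer values, elements are LU values.
lookup <- c("Level 1" = "Park", "Level 2" = "Agri", "Level 3" = "GS")

# match() gives the position of each Layer in the lookup (NA if absent).
data2$LU <- unname(lookup[match(data2$Layer, names(lookup))])
data2
```

Layers that are missing from the lookup come out as NA in LU.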

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






Hello,

Please cc the r-help list, R-Help is threaded and this can in the future
be helpful to others.

You can combine several patterns like this:


pat <- c("_esmdes|_Des Section|0")
grep(pat, data2$Layer)

or, programmatically,


pat <- paste(c("_esmdes","_Des Section","0"), collapse = "|")


Hope this helps,

Rui Barradas
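
A sketch of how the combined pattern could be used to drop rows, on made-up Layer values. One thing to watch: as a regular expression, "0" matches any string that contains a zero, so a value such as "Level 10" would be removed as well. Whether that is wanted is not stated in the thread; the anchored variant below is one assumption about how to match whole values only.

```r
data2 <- data.frame(Layer = c("Level 1", "_esmdes", "_Des Section", "0", "Level 10"),
                    stringsAsFactors = FALSE)

pat <- paste(c("_esmdes", "_Des Section", "0"), collapse = "|")

# Substring match: also drops "Level 10" because it contains "0".
kept_regex <- data2[!grepl(pat, data2$Layer), , drop = FALSE]

# Whole-value match: anchor the alternatives with ^( ... )$.
exact <- paste0("^(", pat, ")$")
kept_exact <- data2[!grepl(exact, data2$Layer), , drop = FALSE]

kept_regex$Layer
kept_exact$Layer
```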





Hello,

I only posted a corrected grep statement, the complete code should be


pat <- 

Re: [R] Problem with filling dataframe's column

2023-06-13 Thread javad bayat
Dear all;
I used this code and got what I wanted.
Sincerely

pat = c("Level 12","Level 22","0")
data3 = data2[-which(data2$Layer == pat),]
dim(data2)
[1] 281549  9
dim(data3)
[1] 244075  9
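
A word of caution on the comparison above, shown on made-up values: `==` with a length-3 vector on the right recycles it along the column, so each row is compared with only one of the three values, depending on its position. %in% tests every row against all three. The sketch below is an illustration, not a re-run of the original data.

```r
data2 <- data.frame(Layer = c("0", "Level 12", "Level 22",
                              "Level 5", "Level 12", "Level 1"),
                    stringsAsFactors = FALSE)
pat <- c("Level 12", "Level 22", "0")

# Recycled comparison: row 1 vs "Level 12", row 2 vs "Level 22", row 3 vs "0", ...
by_eq <- data2$Layer == pat
sum(by_eq)                      # no hits here, although four rows should go

# With no hits, -which() selects zero rows instead of all of them.
lost <- data2[-which(by_eq), , drop = FALSE]

# Set membership does what was intended.
data3 <- data2[!(data2$Layer %in% pat), , drop = FALSE]
data3$Layer
```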

On Tue, Jun 13, 2023 at 11:36 AM Eric Berger  wrote:

> Hi Javed,
> grep returns the positions of the matches. See an example below.
>
> > v <- c("abc", "bcd", "def")
> > v
> [1] "abc" "bcd" "def"
> > grep("cd",v)
> [1] 2
> > w <- v[-grep("cd",v)]
> > w
> [1] "abc" "def"
> >
>
>
> On Tue, Jun 13, 2023 at 8:50 AM javad bayat  wrote:
> >
> > Dear Rui;
> > Hi. I used your codes, but it seems it didn't work for me.
> >
> > > pat <- c("_esmdes|_Des Section|0")
> > > dim(data2)
> > [1]  281549  9
> > > grep(pat, data2$Layer)
> > > dim(data2)
> > [1]  281549  9
> >
> > What does grep function do? I expected the function to remove 3 rows of
> the
> > dataframe.
> > I do not know the reason.
> >
> >
> >
> >
> >
> >
> > On Mon, Jun 12, 2023 at 5:16 PM Rui Barradas 
> wrote:
> >
> > > > On 12/06/2023 at 23:13, javad bayat wrote:
> > > > Dear Rui;
> > > > Many thanks for the email. I tried your codes and found that the
> length
> > > of
> > > > the "Values" and "Names" vectors must be equal, otherwise the results
> > > will
> > > > not be useful.
> > > > For some of the characters in the Layer column that I do not need to
> be
> > > > filled in the LU column, I used "NA".
> > > > But I need to delete some of the rows from the table as they are
> useless
> > > > for me. I tried this code to delete entire rows of the dataframe
> which
> > > > contained these three value in the Layer column: It gave me the
> following
> > > > error.
> > > >
> > > >> data3 = data2[-grep(c("_esmdes","_Des Section","0"), data2$Layer),]
> > > >   Warning message:
> > > >In grep(c("_esmdes", "_Des Section", "0"), data2$Layer) :
> > > >argument 'pattern' has length > 1 and only the first element
> will
> > > be
> > > > used
> > > >
> > > >> data3 = data2[!grepl(c("_esmdes","_Des Section","0"), data2$Layer),]
> > > >  Warning message:
> > > >  In grepl(c("_esmdes", "_Des Section", "0"), data2$Layer) :
> > > >  argument 'pattern' has length > 1 and only the first element
> will be
> > > > used
> > > >
> > > > How can I do this?
> > > > Sincerely
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Sun, Jun 11, 2023 at 5:03 PM Rui Barradas 
> > > wrote:
> > > >
> > > > >> On 11/06/2023 at 13:18, Rui Barradas wrote:
> > > > >>> On 11/06/2023 at 22:54, javad bayat wrote:
> > >  Dear Rui;
> > >  Many thanks for your email. I used one of your codes,
> > >  "data2$LU[which(data2$Layer == "Level 12")] <- "Park"", and it
> works
> > >  correctly for me.
> > >  Actually I need to expand the codes so as to consider all
> "Levels" in
> > > >> the
> > >  "Layer" column. There are more than hundred levels in the Layer
> > > column.
> > >  If I use your provided code, I have to write it hundred of time as
> > > >> below:
> > >  data2$LU[which(data2$Layer == "Level 1")] <- "Park";
> > >  data2$LU[which(data2$Layer == "Level 2")] <- "Agri";
> > >  ...
> > >  ...
> > >  ...
> > >  .
> > >  Is there any other way to expand the code in order to consider
> all of
> > > >> the
> > >  levels simultaneously? Like the below code:
> > >  data2$LU[which(data2$Layer == c("Level 1","Level 2", "Level 3",
> ...))]
> > > >> <-
> > >  c("Park", "Agri", "GS", ...)
> > > 
> > > 
> > >  Sincerely
> > > 
> > > 
> > > 
> > > 
> > >  On Sun, Jun 11, 2023 at 1:43 PM Rui Barradas <
> ruipbarra...@sapo.pt>
> > >  wrote:
> > > 
> > > > > On 11/06/2023 at 21:05, javad bayat wrote:
> > > >> Dear R users;
> > > >> I am trying to fill a column based on a specific value in
> another
> > > >> column
> > > > of
> > > >> a dataframe, but it seems there is a problem with the codes!
> > > >> The "Layer" and the "LU" are two different columns of the
> dataframe.
> > > >> How can I fix this?
> > > >> Sincerely
> > > >>
> > > >>
> > > >> for (i in 1:nrow(data2$Layer)){
> > > >>  if (data2$Layer == "Level 12") {
> > > >>  data2$LU == "Park"
> > > >>  }
> > > >>  }
> > > >>
> > > >>
> > > >>
> > > >>
> > > > Hello,
> > > >
> > > > > There are three bugs in your code,
> > > > >
> > > > > 1) the index i is not used in the loop
> > > > > 2) the assignment operator is `<-`, not `==`
> > > > > 3) nrow() of a single column is NULL; loop over the rows of the
> > > > >    data frame instead
> > > > >
> > > > >
> > > > > Here is the loop corrected.
> > > > >
> > > > > for (i in seq_len(nrow(data2))) {
> > > > >   if (data2$Layer[i] == "Level 12") {
> > > > >     data2$LU[i] <- "Park"
> > > > >   }
> > > > > }
> > > >
> > > >
> > > >
> > > > > But R is a vectorized language; the following two ways are the idiomatic
> > > > ways 

Re: [R] Problem with filling dataframe's column

2023-06-13 Thread Eric Berger
Hi Javad,
grep returns the positions of the matches. See an example below.

> v <- c("abc", "bcd", "def")
> v
[1] "abc" "bcd" "def"
> grep("cd",v)
[1] 2
> w <- v[-grep("cd",v)]
> w
[1] "abc" "def"
>
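
Extending the example above with a short sketch: grep() only reports positions, so the original object stays as it is until the result is assigned. The empty-match case is an addition of mine to show a common pitfall: negating an empty index drops everything, which the logical form with grepl() avoids.

```r
v <- c("abc", "bcd", "def")

idx <- grep("cd", v)         # position of the match; v itself is untouched
w <- v[-idx]                 # assign the result to actually remove it

none  <- grep("zz", v)       # integer(0): nothing matches
wrong <- v[-none]            # character(0): everything is dropped
safe  <- v[!grepl("zz", v)]  # all three elements kept
```

The same applies to a data frame: data2 keeps its 281549 rows until the subset is assigned to a new object.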


On Tue, Jun 13, 2023 at 8:50 AM javad bayat  wrote:
>
> Dear Rui;
> Hi. I used your codes, but it seems it didn't work for me.
>
> > pat <- c("_esmdes|_Des Section|0")
> > dim(data2)
> [1]  281549  9
> > grep(pat, data2$Layer)
> > dim(data2)
> [1]  281549  9
>
> What does grep function do? I expected the function to remove 3 rows of the
> dataframe.
> I do not know the reason.
>
>
>
>
>
>
> On Mon, Jun 12, 2023 at 5:16 PM Rui Barradas  wrote:
>
> > > On 12/06/2023 at 23:13, javad bayat wrote:
> > > Dear Rui;
> > > Many thanks for the email. I tried your codes and found that the length
> > of
> > > the "Values" and "Names" vectors must be equal, otherwise the results
> > will
> > > not be useful.
> > > For some of the characters in the Layer column that I do not need to be
> > > filled in the LU column, I used "NA".
> > > But I need to delete some of the rows from the table as they are useless
> > > for me. I tried this code to delete entire rows of the dataframe which
> > > contained these three value in the Layer column: It gave me the following
> > > error.
> > >
> > >> data3 = data2[-grep(c("_esmdes","_Des Section","0"), data2$Layer),]
> > >   Warning message:
> > >In grep(c("_esmdes", "_Des Section", "0"), data2$Layer) :
> > >argument 'pattern' has length > 1 and only the first element will
> > be
> > > used
> > >
> > >> data3 = data2[!grepl(c("_esmdes","_Des Section","0"), data2$Layer),]
> > >  Warning message:
> > >  In grepl(c("_esmdes", "_Des Section", "0"), data2$Layer) :
> > >  argument 'pattern' has length > 1 and only the first element will be
> > > used
> > >
> > > How can I do this?
> > > Sincerely
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Sun, Jun 11, 2023 at 5:03 PM Rui Barradas 
> > wrote:
> > >
> > > >> On 11/06/2023 at 13:18, Rui Barradas wrote:
> > > >>> On 11/06/2023 at 22:54, javad bayat wrote:
> >  Dear Rui;
> >  Many thanks for your email. I used one of your codes,
> >  "data2$LU[which(data2$Layer == "Level 12")] <- "Park"", and it works
> >  correctly for me.
> >  Actually I need to expand the codes so as to consider all "Levels" in
> > >> the
> >  "Layer" column. There are more than hundred levels in the Layer
> > column.
> >  If I use your provided code, I have to write it hundred of time as
> > >> below:
> >  data2$LU[which(data2$Layer == "Level 1")] <- "Park";
> >  data2$LU[which(data2$Layer == "Level 2")] <- "Agri";
> >  ...
> >  ...
> >  ...
> >  .
> >  Is there any other way to expand the code in order to consider all of
> > >> the
> >  levels simultaneously? Like the below code:
> >  data2$LU[which(data2$Layer == c("Level 1","Level 2", "Level 3", ...))]
> > >> <-
> >  c("Park", "Agri", "GS", ...)
> > 
> > 
> >  Sincerely
> > 
> > 
> > 
> > 
> >  On Sun, Jun 11, 2023 at 1:43 PM Rui Barradas 
> >  wrote:
> > 
> > > > On 11/06/2023 at 21:05, javad bayat wrote:
> > >> Dear R users;
> > >> I am trying to fill a column based on a specific value in another
> > >> column
> > > of
> > >> a dataframe, but it seems there is a problem with the codes!
> > >> The "Layer" and the "LU" are two different columns of the dataframe.
> > >> How can I fix this?
> > >> Sincerely
> > >>
> > >>
> > >> for (i in 1:nrow(data2$Layer)){
> > >>  if (data2$Layer == "Level 12") {
> > >>  data2$LU == "Park"
> > >>  }
> > >>  }
> > >>
> > >>
> > >>
> > >>
> > > Hello,
> > >
> > > > There are three bugs in your code,
> > > >
> > > > 1) the index i is not used in the loop
> > > > 2) the assignment operator is `<-`, not `==`
> > > > 3) nrow() of a single column is NULL; loop over the rows of the
> > > >    data frame instead
> > > >
> > > >
> > > > Here is the loop corrected.
> > > >
> > > > for (i in seq_len(nrow(data2))) {
> > > >   if (data2$Layer[i] == "Level 12") {
> > > >     data2$LU[i] <- "Park"
> > > >   }
> > > > }
> > >
> > >
> > >
> > > > But R is a vectorized language; the following two ways are the idiomatic
> > > > ways of doing what you want to do.
> > >
> > >
> > >
> > > i <- data2$Layer == "Level 12"
> > > data2$LU[i] <- "Park"
> > >
> > > # equivalent one-liner
> > > data2$LU[data2$Layer == "Level 12"] <- "Park"
> > >
> > >
> > >
> > > > If there are NAs in data2$Layer it's probably safer to wrap the logical
> > > > index in which() (see ?which), to have a numeric one.
> > >
> > >
> > >
> > > i <- which(data2$Layer == "Level 12")
> > > data2$LU[i] <- "Park"
> > >
> > > # equivalent one-liner
> > > data2$LU[which(data2$Layer == "Level 12")] <- "Park"