[R] problem with switching windows in RSelenium..

2021-12-22 Thread akshay kulkarni
Dear Kim,
Hope you are doing well.

I am Akshay, from Bengaluru, India. I am a stock trader and use R for my
research. More specifically, I am using RSelenium to scrape news articles, and I am
stuck on a problem with RSelenium.

I am not able to switch windows in RSelenium 1.7.7 (I am using Chrome). My
situation is exactly as described in this link:
https://github.com/ropensci/RSelenium/issues/143
I also referred to this link: https://github.com/ropensci/RSelenium/issues/205
I have three questions:


1. Please refer to the second link. I ran the following command as suggested
in that link:

> binman::list_versions("chromedriver")
$win32
[1] "96.0.4664.45" "97.0.4692.20" "97.0.4692.36"

I am currently using the first option. Will I have better luck if I switch to
either the second or the third option, or to any other version? Also, I suppose
these versions are 32-bit ($win32 above). Would switching to a 64-bit version
help? If so, how do I switch to a 64-bit version? (I am using an
AWS EC2 Windows instance, which is a 64-bit system.)

2. Please refer to the first link. I have read in the comments that the 
myswitch function works in these cases. The solution was presented in 2017. 
Will myswitch still be valid in December 2021? If not, can you please give me a 
modified version?

3. My RSelenium session gets terminated after some period of
inactivity. How can I change that?

Your help will be highly appreciated.

Thanking You,
Yours sincerely,
AKSHAY M KULKARNI



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] able to estimate in the excel but not in R, any suggestion?

2021-12-22 Thread Marna Wagley
Dear Jim,
Thank you very much for the help. The code seems to be right. Using this
code, I got exactly the same value as in Excel.
This is great.
Thanks
MW



Re: [R] able to estimate in the excel but not in R, any suggestion?

2021-12-22 Thread jim holtman
You need to use the 'ifelse' function.  I think I copied down your
formula and here is the output:

> daT<-structure(list(sd = c(0.481, 0.682, 0.741, 0.394, 0.2, 0.655, 0.375),
+ mcd = c(51.305, 51.284, 51.249, 51.2, 51.137, 51.059, 50.968), ca =
+ c(49.313, 69.985, 75.914, 40.303, 20.493, 66.905,38.185)), class =
+ "data.frame", row.names = c(NA, -7L))
> head(daT)
     sd    mcd     ca
1 0.481 51.305 49.313
2 0.682 51.284 69.985
3 0.741 51.249 75.914
4 0.394 51.200 40.303
5 0.200 51.137 20.493
6 0.655 51.059 66.905
>
> # add in a new column with the calculation
>
> daT$ca_1 <- with(daT,
+     ifelse(sd > mcd * 2,
+         pi * mcd ^ 2,
+         (0.5 * sd) * sqrt(mcd^2 - (0.5 * sd)^2) +
+             mcd^2 * asin((0.5 * sd) / (mcd)) * 2
+     )
+ )
>
> daT
     sd    mcd     ca     ca_1
1 0.481 51.305 49.313 37.01651
2 0.682 51.284 69.985 52.46340
3 0.741 51.249 75.914 56.96310
4 0.394 51.200 40.303 30.25918
5 0.200 51.137 20.493 15.34110
6 0.655 51.059 66.905 50.16535
7 0.375 50.968 38.185 28.66948
>
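
A side note, offered as a sketch rather than a definitive fix. `if` tests a single condition, while `ifelse` works element by element over whole columns, and R's `pi` is a constant rather than a function like Excel's PI(). One detail worth double-checking: in the Excel formula the final *2 multiplies the whole bracketed sum, whereas in the call above it binds only to the asin term, so the two groupings can give different numbers. The version below keeps the Excel grouping; `ca_excel` and `half` are names made up for this illustration.

```r
# Data as posted in the thread
daT <- data.frame(
  sd  = c(0.481, 0.682, 0.741, 0.394, 0.2, 0.655, 0.375),
  mcd = c(51.305, 51.284, 51.249, 51.2, 51.137, 51.059, 50.968),
  ca  = c(49.313, 69.985, 75.914, 40.303, 20.493, 66.905, 38.185)
)

# Same grouping as the Excel formula: the trailing * 2 multiplies the whole sum.
# ca_excel is a hypothetical column name used only for this sketch.
daT$ca_excel <- with(daT, {
  half <- 0.5 * sd
  ifelse(sd > mcd * 2,
         pi * mcd^2,
         (half * sqrt(mcd^2 - half^2) + mcd^2 * asin(half / mcd)) * 2)
})

# Largest absolute difference from the posted ca column
max(abs(daT$ca_excel - daT$ca))
```

Small differences from the posted ca column are to be expected if the spreadsheet held more decimal places than the three shown for sd.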


Thanks

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.



[R] able to estimate in the excel but not in R, any suggestion?

2021-12-22 Thread Marna Wagley
Hi R users,
I was trying to compute some values in R but could not figure out how to
write the script, although I was able to compute them correctly in
Excel. For example, I have the following data set.

daT<-structure(list(sd = c(0.481, 0.682, 0.741, 0.394, 0.2, 0.655, 0.375),
mcd = c(51.305, 51.284, 51.249, 51.2, 51.137, 51.059, 50.968), ca =
c(49.313, 69.985, 75.914, 40.303, 20.493, 66.905,38.185)), class =
"data.frame", row.names = c(NA, -7L))
head(daT)

In this data set, I need to compute the column named "ca". In Excel,
I computed the value using the following formula:
IF(A2>B2*2,PI()*B2^2,((0.5*A2)*SQRT(B2^2-(0.5*A2)^2)+B2^2*ASIN((0.5*A2)/B2))*2)

But when I wrote the following code in R, it did not work:
attach(daT)
daT$ca<-if(sd>mcd*2,pi()*mcd^2,((0.5*sd)*sqrt(mcd^2-(0.5*sd)^2)+mcd^2*asin((0.5*sd)/mcd))*2)

Your suggestion would be highly appreciated.

Sincerely,

MW



Re: [R] Speed up studentized confidence intervals ?

2021-12-22 Thread David Winsemius
I’m wondering if this is an X-Y problem (a request to do X when the real
problem should be doing Y). You haven’t explained the goals in natural or
mathematical language, which leaves me wondering why you are doing either
sampling or replication (much less doing both within each iteration in the
function given to boot).

— 
David

Sent from my iPhone

> On Dec 19, 2021, at 3:50 AM, varin sacha via R-help  
> wrote:
> 
> Dear R-experts,
> 
> Below is my R code, which works but really, really slowly! I need 2 hours with
> my computer to finally get an answer! Is there a way to improve my R code to
> speed it up? At least to save 1 hour ;=)
> 
> Many thanks
> 
> 
> library(boot)
> 
> s<- sample(178:798, 10, replace=TRUE)
> mean(s)
> 
> N <- 1000
> out <- replicate(N, {
> a<- sample(s,size=5)
> mean(a)
> dat<-data.frame(a)
> 
> med<-function(d,i) {
> temp<-d[i,]
> f<-mean(temp)
> g<-var(replicate(50,mean(sample(temp,replace=T))))
> return(c(f,g))
> 
> }
> 
>   boot.out <- boot(data = dat, statistic = med, R = 1)
>   boot.ci(boot.out, type = "stud")$stud[, 4:5]
> })
> mean(out[1,] < mean(s) & mean(s) < out[2,]) 
> 
> 


Re: [R] Creating NA equivalent

2021-12-22 Thread John Dougherty via R-help
On Tue, 21 Dec 2021 05:41:31 +0100
Marc Girondot via R-help  wrote:

> Dear members,
> 
> I work on dosage data and some values are below the detection limit. I
> would like to create new "numbers" like LDL (to represent lower than
> detection limit) and UDL (above the upper detection limit) that behave like
> NA, with the possibility to test them using for example is.LDL() or
> is.UDL().
> 
> Note that NA is not the same as LDL or UDL: NA represents missing
> data. Here the data is available as LDL or UDL.
>
> NA is built very deep into the R language... is there any option to create a new
> NA-equivalent?
> 
> Thanks
> 
> Marc

You are concerned with a distinct quality in the data with respect to a
specific method. You might want to code a qualitative variable that
defines the detectability state of the specific reading.  Then filter on
the state of interest, and as a means of establishing the quality of the
method or the data, summarize the detection properties in your sample
for the analytical method employed. I had an engineer tell me flatly
that the measures claimed in a paper were "impossible."  The method
used was already common, but his system was not sensitive enough.  

As far as the statistical properties go, there are measures that could
be made and measures that could not be made.  If a different method
became available, you would probably still desire to either reanalyze
the older data employing the new method, or append new measures where
they were previously unavailable. Either way you encounter
data range or compatibility issues that have to be addressed
methodologically.
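
One way to sketch the "qualitative variable" idea in R is shown below. The readings, the column names `value` and `status`, and the level labels are all made up for illustration; the helper names is.LDL() and is.UDL() are borrowed from the question and are not existing R functions.

```r
# Hypothetical assay readings; NA in 'value' means no number was reported,
# and 'status' records why (or "ok" when a number is available)
readings <- data.frame(
  value  = c(0.8, NA, 2.4, NA, NA, 1.1),
  status = factor(c("ok", "LDL", "ok", "UDL", "missing", "ok"),
                  levels = c("ok", "LDL", "UDL", "missing"))
)

# Small helpers that test the detection state rather than the number itself
is.LDL <- function(d) d$status == "LDL"
is.UDL <- function(d) d$status == "UDL"

# Summarize the detection properties of the sample
table(readings$status)

# Filter on the state of interest before computing statistics
mean_ok <- mean(readings$value[readings$status == "ok"])
mean_ok
```

Keeping the state in its own column leaves the numeric column free of special codes, so ordinary functions such as mean() and is.na() keep their usual meaning.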



Re: [R] for loop question in R

2021-12-22 Thread Kai Yang via R-help
Hi Rui and Ivan,
Thank you for explaining the code to me in detail. This is very
helpful, and the code works well now.
Happy Holidays,
Kai


Re: [R] for loop question in R

2021-12-22 Thread Rui Barradas

Hello,

There's a stupid typo in my previous post. Inline

Às 22:30 de 22/12/21, Rui Barradas escreveu:

Like Ivan said, don't rely on auto print. In order to have to open the 


Should read

In order to *avoid* having to open the


Rui Barradas




Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Rui Barradas

Hello,

The error is a simple typo, instead of the period after names(Data[,1]), 
it should be a comma.


cat(format(names(Data[,1]), "\n", v1, justify = "right"), sep = "\n")

(The error message accurately points out where the error is. In
these cases, try to read the instruction more carefully; typos can be
hard to find.)
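
For what it is worth, here is a minimal sketch of one way to print a column name above the right-justified values. It takes the label from names(Data)[1], because with a data frame from read.csv() the expression Data[, 1] is a bare vector and names(Data[, 1]) is NULL. The data frame and its column name `score` are hypothetical stand-ins for the poster's Source.csv.

```r
# Hypothetical stand-in for the data read from Source.csv
Data <- data.frame(score = c(5, 3, 10, 2, 3, 9))

v1 <- sort(unique(Data[, 1]))
label <- names(Data)[1]          # the column name, taken from the data frame

# Format the label and the values together so they share one width
out <- format(c(label, v1), justify = "right")
cat(out, sep = "\n")
```

Formatting the label and the values in a single call is a design choice: it pads everything to the width of the longest entry, so the header lines up with the numbers.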



Hope this helps,

Rui Barradas

Às 17:59 de 22/12/21, Stephen H. Dawson, DSL via R-help escreveu:

Thanks.

I am pondering label names, not set on one as of yet. I like your 
recommendation.



 > cat(format(names(Data[,1]). "\n", v1, justify = "right"), sep = "\n")
Error: unexpected symbol in "cat(format(names(Data[,1])."
 >

Your proposed syntax has an error.

QUESTION
Can you identify the error and reply with another recommendation, please?


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 12:33 PM, Avi Gross via R-help wrote:

Stephen,

Why should there be a column header when you take your data and 
reformat it?


cat(format(v1, justify = "right"), sep = "\n")

The above is no longer your original data structure and has specified 
what you want printed. Your column header and other names associated 
with your original data.frame are stored as attributes that you sort 
of discarded.


The name you want is associated not with v1 but with what you call 
Data[,1] and you can get that name using names(Data[,1]) and put it 
where you want. In your case, if you want the single line above your 
values to have that name, this would do it:


cat(format(names(Data[,1]). "\n", v1, justify = "right"), sep = "\n")

-----Original Message-----
From: R-help On Behalf Of Stephen H. Dawson, DSL via R-help
Sent: Wednesday, December 22, 2021 12:02 PM
To: Duncan Murdoch; Rui Barradas; Stephen H. Dawson, DSL via R-help
Subject: Re: [R] Adding SORT to UNIQUE

Data <- read.csv("./input/Source.csv", header=T)
v1 <- sort(unique(Data[, 1]))
cat(format(v1, justify = "right"), sep = "\n")

OK, working with the options you presented. This is the combination 
where I gain the most benefit.


However, there is no listing of a column header with the output of 
this syntax.


> cat(format(v1, justify = "right"), sep = "\n")
 2
 3
 4
 5
 6
 7
 8
 9
10
>

NOTE
The output here is correct (unique) based on the entries from the column.

QUESTION
How does one add a text label of something as simple as v1 to the 
vertical output of this syntax, please?




On 12/22/21 11:13 AM, Stephen H. Dawson, DSL via R-help wrote:

OK, now I get what you are suggesting.

Much appreciated.


Kindest Regards,


On 12/22/21 11:08 AM, Duncan Murdoch wrote:

On 22/12/2021 10:55 a.m., Stephen H. Dawson, DSL wrote:

I see.

So, we are talking taking the output into a new dataframe. I was
hoping to have the output rendered on screen without another
dataframe, but I can live with this option it if must occur.

Am I correct the desired vertical output must first go to a dataframe?

No, that's just one option.  The other 3 don't use dataframes.

Duncan Murdoch




On 12/22/21 10:47 AM, Duncan Murdoch wrote:

On 22/12/2021 10:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

Both syntax options work to render the correct (unique) output.
However,
the output is rendered as horizontal. What needs to happen to get
the output to render vertical, please?

The result of those expressions is a vector of the same type as the
column, so your question is really about how to get a vector to
print one element per line.

Probably the simplest way is to put the vector in a dataframe (or
matrix, or tibble, depending on which formatting you prefer). For
example,


 v <- c("red", "green", "blue")
 data.frame(v)

      v
1   red
2 green
3  blue

If you want a more minimal display, try


cat(v, sep = "\n")

red
green
blue

or


cat(format(v, justify = "right"), sep = "\n")

  red
green
 blue

If you want this to happen when you auto-print the object, you can
give it a class attribute and write a function to print that class,
e.g.


class(v) <- "oneperline"

print.oneperline <- function(x, ...)
    cat(format(x, justify = "right"), sep = "\n")

 v

  red
green
 blue

Duncan Murdoch





On 12/21/21 11:38 AM, Duncan Murdoch wrote:

On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:

On 21/12/2021 11:20 a.m., Stephen H. 

Re: [R] for loop question in R

2021-12-22 Thread Rui Barradas

Hello,

y[i] and c[i] are character strings, they are not variables of data set mpg.
To get the variables, use, well, help("get").
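
A small base-R sketch of the difference between a column name held in a string and the column itself. It uses the built-in mtcars data as a stand-in for mpg, so the names below are illustrative only and are not the poster's variables.

```r
# mtcars ships with R (datasets package) and stands in for mpg here
y <- c("mpg", "hp")
i <- 1

y[i]                                 # a character string, not a column

by_name <- mtcars[[y[i]]]            # look the column up by its name
by_get  <- with(mtcars, get(y[i]))   # get() resolves the name inside the data

identical(by_name, by_get)
```

Inside aes(), get(y[i]) plays the role that with(mtcars, get(y[i])) plays here: the string is turned back into the variable it names.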

Note that I have changed the temp dir to mine. So I created a variable 
to hold the value



tmpdir <- "c:/temp/"

for (i in seq(nrow(mac))){
  mpg %>%
    filter(hwy < 35) %>%
    ggplot(aes(x = displ, y = get(y[i]), color = get(c[i]))) +
    geom_point() +
    ylab(y[i]) +
    guides(color = guide_legend(title = c[i]))
  ggsave(
    paste0(tmpdir, f[i], ".jpg"),
    width = 9,
    height = 6,
    dpi = 1200,
    units = "in")
}



Like Ivan said, don't rely on auto print. In order to avoid having to open the
graphics files output by the loop, I would have done something like the
following.


First create a list to hold the plots. Inside the for loop save the 
plots in the list and explicitly print them. And use ggsave argument 
plot. Like this, after the loop you can see what you have by printing 
each list member.



p <- vector("list", length = nrow(mac))
for (i in seq(nrow(mac))){
  mpg %>%
    filter(hwy < 35) %>%
    ggplot(aes(x = displ, y = get(y[i]), color = get(c[i]))) +
    geom_point() +
    ylab(y[i]) +
    guides(color = guide_legend(title = c[i])) -> p[[i]]
  ggsave(
    paste0(tmpdir, f[i], ".jpg"),
    plot = p[[i]],
    width = 9,
    height = 6,
    dpi = 1200,
    units = "in")
}

# See the first plot
p[[1]]


Hope this helps,

Rui Barradas



Re: [R] for loop question in R

2021-12-22 Thread Kai Yang via R-help
Strange, I got an error message when I ran it again:
Error: unexpected symbol in:
"    geom_point()
  ggsave"
> }
Error: unexpected '}' in "}"



Re: [R] for loop question in R

2021-12-22 Thread Kai Yang via R-help
Hello Eric, Jim and Ivan,
Many thanks for all of your help. I'm new to R, so I may not fully
understand your ideas. I modified my code below. I can get the plots
out with the correct file names, but the plots are not using the correct
field names: they use y[i] and c[i] as the variable names instead of hwy, cyl
or cty, class in the ggplot statement, and there is no error message. Could
you please look into my modified code below and let me know how to modify the
y = y[i], color = c[i] part?
Thanks,
Kai

y <- c("hwy","cty")
c <- c("cyl","class")
f <- c("hwy_cyl","cty_class")
mac <- data.frame(y,c,f)
for (i in seq(nrow(mac))){
  mpg %>%
    filter(hwy <35) %>% 
    ggplot(aes(x = displ, y = y[i], color = c[i])) + 
    geom_point()
  ggsave(paste0("c:/temp/",f[i],".jpg"),width = 9, height = 6, dpi = 1200, 
units = "in")
}

On Wednesday, December 22, 2021, 09:42:45 AM PST, Ivan Krylov 
 wrote:  
 
 On Wed, 22 Dec 2021 16:58:18 + (UTC)
Kai Yang via R-help  wrote:

> mpg %>%    filter(hwy <35) %>%     ggplot(aes(x = displ, y = y[i],
> color = c[i])) +     geom_point()

Your code relies on R's auto-printing, where each line of code executed
at the top level (not in loops or functions) is run as if it was
wrapped in print(...the rest of the line...).

Solution: make that print() explicit.

A better solution: explicitly pass the plot object returned by the
ggplot functions to the ggsave() function instead of relying on the
global state of the program.

> ggsave("c:/temp/f[i].jpg",width = 9, height = 6, dpi = 1200, units =
> "in")

When you type "c:/temp/f[i].jpg", what do you get in return?

Use paste0() or sprintf() to compose strings out of parts.

>     [[alternative HTML version deleted]]

P.S. Please compose your messages in plain text, not HTML. See the
R-help posting guide for more info.

-- 
Best regards,
Ivan
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Stephen H. Dawson, DSL via R-help

Thanks.

I am pondering label names, not set on one as of yet. I like your 
recommendation.



> cat(format(names(Data[,1]). "\n", v1, justify = "right"), sep = "\n")
Error: unexpected symbol in "cat(format(names(Data[,1])."
>

Your proposed syntax has an error.

QUESTION
Can you identify the error and reply with another recommendation, please?
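
For what it is worth, the parse error comes from the "." after names(Data[,1]), where a comma was presumably intended. Fixing the comma alone is probably not enough, though: Data[, 1] is a plain vector and carries no column name, so names() on it returns NULL. A minimal sketch of one alternative follows; the small Data frame and its column name "v1" are made up here to stand in for the CSV file.

```r
# Made-up stand-in for read.csv("./input/Source.csv"); the column name is hypothetical.
Data <- data.frame(v1 = c(3, 10, 2, 3, 9), other = letters[1:5])

v1 <- sort(unique(Data[, 1]))

# Data[, 1] is a bare vector, so there are no names to look up.
no_names <- is.null(names(Data[, 1]))

# Take the header from the data frame itself, then format header and values
# together so that they share one width and line up on the right.
out <- format(c(names(Data)[1], v1), justify = "right")
cat(out, sep = "\n")
```

Formatting the label together with the values keeps the header aligned with the column, which a separate cat("v1\n") call would not guarantee for longer names.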


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 12:33 PM, Avi Gross via R-help wrote:

Stephen,

Why should there be a column header when you take your data and reformat it?

cat(format(v1, justify = "right"), sep = "\n")

The above is no longer your original data structure and has specified what you 
want printed. Your column header and other names associated with your original 
data.frame are stored as attributes that you sort of discarded.

The name you want is associated not with v1 but with what you call Data[,1] and 
you can get that name using names(Data[,1]) and put it where you want. In your 
case, if you want the single line above your values to have that name, this 
would do it:

cat(format(names(Data[,1]). "\n", v1, justify = "right"), sep = "\n")

-Original Message-
From: R-help  On Behalf Of Stephen H. Dawson, DSL 
via R-help
Sent: Wednesday, December 22, 2021 12:02 PM
To: Duncan Murdoch ; Rui Barradas ; 
Stephen H. Dawson, DSL via R-help 
Subject: Re: [R] Adding SORT to UNIQUE

Data <- read.csv("./input/Source.csv", header=T)
v1 <- sort(unique(Data[, 1]))
cat(format(v1, justify = "right"), sep = "\n")

OK, working with the options you presented. This is the combination where I 
gain the most benefit.

However, there is no listing of a column header with the output of this syntax.

  > cat(format(v1, justify = "right"), sep = "\n")
   2
   3
   4
   5
   6
   7
   8
   9
10
  >

NOTE
The output here is correct (unique) based on the entries from the column.

QUESTION
How does one add a text label of something as simple as v1 to the vertical 
output of this syntax, please?

*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 11:13 AM, Stephen H. Dawson, DSL via R-help wrote:

OK, now I get what you are suggesting.

Much appreciated.


Kindest Regards,
*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 11:08 AM, Duncan Murdoch wrote:

On 22/12/2021 10:55 a.m., Stephen H. Dawson, DSL wrote:

I see.

So, we are talking about taking the output into a new dataframe. I was
hoping to have the output rendered on screen without another
dataframe, but I can live with this option if it must occur.

Am I correct the desired vertical output must first go to a dataframe?

No, that's just one option.  The other 3 don't use dataframes.

Duncan Murdoch


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 10:47 AM, Duncan Murdoch wrote:

On 22/12/2021 10:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

Both syntax options work to render the correct (unique) output.
However,
the output is rendered as horizontal. What needs to happen to get
the output to render vertical, please?

The result of those expressions is a vector of the same type as the
column, so your question is really about how to get a vector to
print one element per line.

Probably the simplest way is to put the vector in a dataframe (or
matrix, or tibble, depending on which formatting you prefer). For
example,


 v <- c("red", "green", "blue")
 data.frame(v)

v
1   red
2 green
3  blue

If you want a more minimal display, try


cat(v, sep = "\n")

red
green
blue

or


cat(format(v, justify = "right"), sep = "\n")

red
green
   blue

If you want this to happen when you auto-print the object, you can
give it a class attribute and write a function to print that class,
e.g.


class(v) <- "oneperline"

 print.oneperline <- function(x, ...) cat(format(x, justify =

"right"), sep = "\n")

 v

red
green
   blue

Duncan Murdoch



*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:38 AM, Duncan Murdoch wrote:

On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:

On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

sort(unique(Data[1]))
Error in `[.data.frame`(x, order(x, na.last = na.last,
decreasing =
decreasing)) :
undefined columns selected

That's the wrong syntax:  Data[1] is not "column one of Data".
Use Data[[1]] for that, so

   sort(unique(Data[[1]]))

Actually, I'd probably recommend

 sort(unique(Data[, 1]))

instead.  This treats Data as a matrix rather than as a list.
Dataframes are 

Re: [R] for loop question in R

2021-12-22 Thread Ivan Krylov
On Wed, 22 Dec 2021 16:58:18 +0000 (UTC)
Kai Yang via R-help wrote:

> mpg %>%    filter(hwy <35) %>%     ggplot(aes(x = displ, y = y[i],
> color = c[i])) +     geom_point()

Your code relies on R's auto-printing, where each line of code executed
at the top level (not in loops or functions) is run as if it was
wrapped in print(...the rest of the line...).

Solution: make that print() explicit.

A better solution: explicitly pass the plot object returned by the
ggplot functions to the ggsave() function instead of relying on the
global state of the program.
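
The auto-printing behaviour can be seen without ggplot2 at all. The sketch below uses capture.output() only to record what would have appeared on screen; the variable names are chosen for illustration.

```r
# At top level, typing an expression prints its value. Inside a loop the
# value is computed but never printed unless print() is called.
silent <- capture.output(for (i in 1:2) i * 10)          # nothing is printed
shown  <- capture.output(for (i in 1:2) print(i * 10))   # explicit print()

length(silent)
shown
```

A ggplot object is drawn by its print method, so a plot built inside a loop stays invisible for the same reason.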

> ggsave("c:/temp/f[i].jpg",width = 9, height = 6, dpi = 1200, units =
> "in")

When you type "c:/temp/f[i].jpg", what do you get in return?

Use paste0() or sprintf() to compose strings out of parts.
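
To make the difference concrete, here is a small base R sketch. The value of i is picked arbitrarily, and file.path() is shown as a further option that is not mentioned above.

```r
f <- c("hwy_cyl", "cty_class")
i <- 2

literal  <- "c:/temp/f[i].jpg"                 # nothing inside the quotes is evaluated
pasted   <- paste0("c:/temp/", f[i], ".jpg")   # pieces glued together
sprinted <- sprintf("c:/temp/%s.jpg", f[i])    # template with a %s placeholder
pathed   <- file.path("c:/temp", paste0(f[i], ".jpg"))

literal
pasted
```

The literal string names one file called f[i].jpg, so every pass through a loop would overwrite the same file.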

>   [[alternative HTML version deleted]]

P.S. Please compose your messages in plain text, not HTML. See the
R-help posting guide for more info.

-- 
Best regards,
Ivan


Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Avi Gross via R-help
Stephen,

Why should there be a column header when you take your data and reformat it? 

cat(format(v1, justify = "right"), sep = "\n")

The above is no longer your original data structure and has specified what you 
want printed. Your column header and other names associated with your original 
data.frame are stored as attributes that you sort of discarded.

The name you want is associated not with v1 but with what you call Data[,1] and 
you can get that name using names(Data[,1]) and put it where you want. In your 
case, if you want the single line above your values to have that name, this 
would do it:

cat(format(names(Data[,1]). "\n", v1, justify = "right"), sep = "\n")

-Original Message-
From: R-help  On Behalf Of Stephen H. Dawson, DSL 
via R-help
Sent: Wednesday, December 22, 2021 12:02 PM
To: Duncan Murdoch ; Rui Barradas 
; Stephen H. Dawson, DSL via R-help 
Subject: Re: [R] Adding SORT to UNIQUE

Data <- read.csv("./input/Source.csv", header=T)
v1 <- sort(unique(Data[, 1]))
cat(format(v1, justify = "right"), sep = "\n")

OK, working with the options you presented. This is the combination where I 
gain the most benefit.

However, there is no listing of a column header with the output of this syntax.

 > cat(format(v1, justify = "right"), sep = "\n")
  2
  3
  4
  5
  6
  7
  8
  9
10
 >

NOTE
The output here is correct (unique) based on the entries from the column.

QUESTION
How does one add a text label of something as simple as v1 to the vertical 
output of this syntax, please?

*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 11:13 AM, Stephen H. Dawson, DSL via R-help wrote:
> OK, now I get what you are suggesting.
>
> Much appreciated.
>
>
> Kindest Regards,
> *Stephen Dawson, DSL*
> /Executive Strategy Consultant/
> Business & Technology
> +1 (865) 804-3454
> http://www.shdawson.com 
>
>
> On 12/22/21 11:08 AM, Duncan Murdoch wrote:
>> On 22/12/2021 10:55 a.m., Stephen H. Dawson, DSL wrote:
>>> I see.
>>>
>>> So, we are talking about taking the output into a new dataframe. I was 
>>> hoping to have the output rendered on screen without another 
>>> dataframe, but I can live with this option if it must occur.
>>>
>>> Am I correct the desired vertical output must first go to a dataframe?
>>
>> No, that's just one option.  The other 3 don't use dataframes.
>>
>> Duncan Murdoch
>>>
>>>
>>> *Stephen Dawson, DSL*
>>> /Executive Strategy Consultant/
>>> Business & Technology
>>> +1 (865) 804-3454
>>> http://www.shdawson.com 
>>>
>>>
>>> On 12/22/21 10:47 AM, Duncan Murdoch wrote:
 On 22/12/2021 10:20 a.m., Stephen H. Dawson, DSL wrote:
> Thanks for the reply.
>
> Both syntax options work to render the correct (unique) output. 
> However,
> the output is rendered as horizontal. What needs to happen to get 
> the output to render vertical, please?

 The result of those expressions is a vector of the same type as the 
 column, so your question is really about how to get a vector to 
 print one element per line.

 Probably the simplest way is to put the vector in a dataframe (or 
 matrix, or tibble, depending on which formatting you prefer). For 
 example,

> v <- c("red", "green", "blue")
> data.frame(v)
v
 1   red
 2 green
 3  blue

 If you want a more minimal display, try

> cat(v, sep = "\n")
 red
 green
 blue

 or

> cat(format(v, justify = "right"), sep = "\n")
red
 green
   blue

 If you want this to happen when you auto-print the object, you can 
 give it a class attribute and write a function to print that class, 
 e.g.

>class(v) <- "oneperline"
>
> print.oneperline <- function(x, ...) cat(format(x, justify =
 "right"), sep = "\n")
>
> v
red
 green
   blue

 Duncan Murdoch

>
>
> *Stephen Dawson, DSL*
> /Executive Strategy Consultant/
> Business & Technology
> +1 (865) 804-3454
> http://www.shdawson.com 
>
>
> On 12/21/21 11:38 AM, Duncan Murdoch wrote:
>> On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:
>>> On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:
 Thanks for the reply.

 sort(unique(Data[1]))
 Error in `[.data.frame`(x, order(x, na.last = na.last, 
 decreasing =
 decreasing)) :
undefined columns selected
>>>
>>> That's the wrong syntax:  Data[1] is not "column one of Data". 
>>> Use Data[[1]] for that, so
>>>
>>>   sort(unique(Data[[1]]))
>>
>> Actually, I'd probably recommend
>>
>> sort(unique(Data[, 1]))
>>
>> instead.  This treats Data as a matrix rather than as a list.
>> 

Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Duncan Murdoch

On 22/12/2021 12:01 p.m., Stephen H. Dawson, DSL wrote:

Data <- read.csv("./input/Source.csv", header=T)
v1 <- sort(unique(Data[, 1]))
cat(format(v1, justify = "right"), sep = "\n")

OK, working with the options you presented. This is the combination
where I gain the most benefit.

However, there is no listing of a column header with the output of this
syntax.

  > cat(format(v1, justify = "right"), sep = "\n")
   2
   3
   4
   5
   6
   7
   8
   9
10
  >

NOTE
The output here is correct (unique) based on the entries from the column.

QUESTION
How does one add a text label of something as simple as v1 to the
vertical output of this syntax, please?


In this case, you'd just put in cat("v1\n") before the given command.

In the general case where you want to get the name of the column from 
the dataframe, I think you'll need to write your own function.  The one 
Rui just posted looks pretty good.  To get it to print without the row 
numbers as in the example above, just change it a little in the header 
and one other line:


print.sortUnique <- function(x, row.names = FALSE, ...){
   n <- max(lengths(x))
   y <- lapply(x, \(.x) c(.x, rep("", n - length(.x))))
   y <- do.call(cbind.data.frame, y)
   names(y) <- names(x)
   print(y, row.names = row.names, ...)
   invisible(x)
}

This will give

> Data2
 V1 V2 V3 V4
  3  2  2  1
  5  4  3  2
  6  5  4  4
  7  6  5  5
  8  9  6  6
  9 11  8  9
 12 15  9 10
 14 16 11 11
 15 17 14 12
 18 18 15 13
 19 19 17 14
 20    19 16
       20 18
          19

with his example data.
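
For a single column rather than a whole data frame, the same idea can be sketched with a small print method that carries the label along as an attribute. The helper one_per_line(), the "label" attribute and the sample Data frame are all hypothetical names introduced for this sketch.

```r
# Hypothetical helper: tag a vector with a label and a print class.
one_per_line <- function(x, label = NULL) {
  structure(x, label = label, class = c("oneperline", class(x)))
}

# Print the label first, then one value per line, all right-justified.
print.oneperline <- function(x, ...) {
  values <- as.vector(unclass(x))   # drop the class and the label attribute
  lines <- format(c(attr(x, "label"), values), justify = "right")
  cat(lines, sep = "\n")
  invisible(x)
}

Data <- data.frame(v1 = c(3, 10, 2, 3, 9))   # made-up data
v <- one_per_line(sort(unique(Data[[1]])), label = names(Data)[1])
v
printed <- capture.output(print(v))
```

The column name is taken from names(Data)[1], so the label follows whatever header the input file has.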

Duncan Murdoch




*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 11:13 AM, Stephen H. Dawson, DSL via R-help wrote:

OK, now I get what you are suggesting.

Much appreciated.


Kindest Regards,
*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 11:08 AM, Duncan Murdoch wrote:

On 22/12/2021 10:55 a.m., Stephen H. Dawson, DSL wrote:

I see.

So, we are talking about taking the output into a new dataframe. I was hoping
to have the output rendered on screen without another dataframe, but I
can live with this option if it must occur.

Am I correct the desired vertical output must first go to a dataframe?


No, that's just one option.  The other 3 don't use dataframes.

Duncan Murdoch



*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 10:47 AM, Duncan Murdoch wrote:

On 22/12/2021 10:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

Both syntax options work to render the correct (unique) output.
However,
the output is rendered as horizontal. What needs to happen to get the
output to render vertical, please?


The result of those expressions is a vector of the same type as the
column, so your question is really about how to get a vector to print
one element per line.

Probably the simplest way is to put the vector in a dataframe (or
matrix, or tibble, depending on which formatting you prefer). For
example,


     v <- c("red", "green", "blue")
     data.frame(v)

    v
1   red
2 green
3  blue

If you want a more minimal display, try


cat(v, sep = "\n")

red
green
blue

or


cat(format(v, justify = "right"), sep = "\n")

    red
green
   blue

If you want this to happen when you auto-print the object, you can
give it a class attribute and write a function to print that class,
e.g.


    class(v) <- "oneperline"

     print.oneperline <- function(x, ...) cat(format(x, justify =

"right"), sep = "\n")


     v

    red
green
   blue

Duncan Murdoch




*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:38 AM, Duncan Murdoch wrote:

On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:

On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

sort(unique(Data[1]))
Error in `[.data.frame`(x, order(x, na.last = na.last,
decreasing =
decreasing)) :
    undefined columns selected


That's the wrong syntax:  Data[1] is not "column one of Data". Use
Data[[1]] for that, so

   sort(unique(Data[[1]]))


Actually, I'd probably recommend

     sort(unique(Data[, 1]))

instead.  This treats Data as a matrix rather than as a list.
Dataframes are lists that look like matrices, but to me the matrix
aspect is usually more intuitive.
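
A short sketch of the three ways of indexing, using a made-up data frame:

```r
Data <- data.frame(a = c(3, 1, 3, 2), b = c("x", "y", "x", "z"))  # made-up data

class(Data[1])     # still a data frame, with one column
class(Data[[1]])   # list-style extraction: a plain vector
class(Data[, 1])   # matrix-style extraction: also a plain vector

by_list   <- sort(unique(Data[[1]]))
by_matrix <- sort(unique(Data[, 1]))
identical(by_list, by_matrix)
```

This holds for base data frames. A tibble keeps its data frame shape under [, 1], so Data[[1]] is the safer choice if tibbles may be involved.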

Duncan Murdoch



I think Rui already pointed out the typo in the quoted text
below...

Duncan Murdoch



The recommended syntax did not work, as listed above.

What I want is the sort of distinct column output. Again, the
column
may
be text or numbers. This is a huge analysis effort with data
coming at
me from many different sources.


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 

Re: [R] for loop question in R

2021-12-22 Thread jim holtman
You may have to add an explicit 'print' to ggplot

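library(ggplot2)
library(tidyverse)
y <- c("hwy","cty")
c <- c("cyl","class")
f <- c("hwy_cyl","cty_class")
mac <- data.frame(y,c,f)
for (i in seq_len(nrow(mac))){   # visit every row, not only the last one
  # build the plot first; .data[[...]] looks a column up by its name
  p <- mpg %>% filter(hwy <35) %>%
    ggplot(aes(x = displ, y = .data[[y[i]]], color = .data[[c[i]]])) + geom_point()
  print(p)   # explicit print: nothing is auto-printed inside a loop
  ggsave(paste0("c:/temp/",f[i],".jpg"), plot = p, width = 9, height = 6,
         dpi = 1200, units = "in")
}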

Thanks

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


On Wed, Dec 22, 2021 at 9:08 AM Kai Yang via R-help
 wrote:
>
> Hello R team, I want to use a for loop to generate multiple plots with 3
> parameters (y is for the y axis, c is for color and f is for the file name in
> the output). I created a data frame to save the information and use it in
> the for loop. I use y[i], c[i] and f[i] in the loop, but it doesn't seem to
> work. Can anyone correct my code to make it work?
> Thanks, Kai
>
> library(ggplot2)
> library(tidyverse)
> y <- c("hwy","cty")
> c <- c("cyl","class")
> f <- c("hwy_cyl","cty_class")
> mac <- data.frame(y,c,f)
> for (i in nrow(mac)){
>   mpg %>% filter(hwy <35) %>%
>     ggplot(aes(x = displ, y = y[i], color = c[i])) + geom_point()
>   ggsave("c:/temp/f[i].jpg",width = 9, height = 6, dpi = 1200, units = "in")
> }


Re: [R] for loop question in R

2021-12-22 Thread Eric Berger
Try replacing
"c:/temp/f[i].jpg"
with
paste0("c:/temp/",f[i],".jpg")


On Wed, Dec 22, 2021 at 7:08 PM Kai Yang via R-help 
wrote:

> Hello R team, I want to use a for loop to generate multiple plots with 3
> parameters (y is for the y axis, c is for color and f is for the file name in
> the output). I created a data frame to save the information and use it in
> the for loop. I use y[i], c[i] and f[i] in the loop, but it doesn't seem to
> work. Can anyone correct my code to make it work?
> Thanks, Kai
>
> library(ggplot2)
> library(tidyverse)
> y <- c("hwy","cty")
> c <- c("cyl","class")
> f <- c("hwy_cyl","cty_class")
> mac <- data.frame(y,c,f)
> for (i in nrow(mac)){
>   mpg %>% filter(hwy <35) %>%
>     ggplot(aes(x = displ, y = y[i], color = c[i])) + geom_point()
>   ggsave("c:/temp/f[i].jpg",width = 9, height = 6, dpi = 1200, units = "in")
> }


Re: [R] for loop question in R

2021-12-22 Thread Andrew Simmons
nrow() is just the numbers of rows in your data frame, use seq_len(nrow())
or seq(nrow()) to loop through all row numbers
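
A small base R sketch of the difference. The data frame mirrors the one in the question; the visited_* vectors are only there to record which values of i the loop sees.

```r
mac <- data.frame(y = c("hwy", "cty"), c = c("cyl", "class"))

# nrow(mac) is the single number 2, so this loop body runs once, with i = 2.
visited_nrow <- integer(0)
for (i in nrow(mac)) visited_nrow <- c(visited_nrow, i)

# seq_len(nrow(mac)) is 1, 2, so every row is visited.
visited_seq <- integer(0)
for (i in seq_len(nrow(mac))) visited_seq <- c(visited_seq, i)

# seq_len() is also safe for zero rows, where 1:nrow(x) would give c(1, 0).
empty <- mac[0, ]
n_iterations_empty <- length(seq_len(nrow(empty)))
```

This would explain why the original loop produced at most one plot.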

On Wed, Dec 22, 2021, 12:08 Kai Yang via R-help 
wrote:

> Hello R team, I want to use a for loop to generate multiple plots with 3
> parameters (y is for the y axis, c is for color and f is for the file name in
> the output). I created a data frame to save the information and use it in
> the for loop. I use y[i], c[i] and f[i] in the loop, but it doesn't seem to
> work. Can anyone correct my code to make it work?
> Thanks, Kai
>
> library(ggplot2)
> library(tidyverse)
> y <- c("hwy","cty")
> c <- c("cyl","class")
> f <- c("hwy_cyl","cty_class")
> mac <- data.frame(y,c,f)
> for (i in nrow(mac)){
>   mpg %>% filter(hwy <35) %>%
>     ggplot(aes(x = displ, y = y[i], color = c[i])) + geom_point()
>   ggsave("c:/temp/f[i].jpg",width = 9, height = 6, dpi = 1200, units = "in")
> }


Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Stephen H. Dawson, DSL via R-help

Wow! Thanks.

I need to process the logic you have presented next week when I have the 
time to focus. I now need to accomplish some productive work output 
based on what I have now for understandings.



Kindest Regards,
*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 11:57 AM, Rui Barradas wrote:

Hello,

The problem is that the vectors of unique values in each column of the 
original data.frame Data need not be same length. And the output of 
sort(unique(.)) is a list of vectors of different lengths. And lists 
print "horizontally", each vector on its own.


Like Duncan said, one of the ways of getting a vertical display is to 
have the list of sorted, unique values be of a custom class and write 
a print method for that class. Here is an example of this. The 
function to sort outputs an object of a class that sub-classes class 
"list". And a print method takes care of the printing. This method 
creates a temp data.frame, prints that df and invisibly returns its 
input.


# Create a test data set
set.seed(2021)
Data <- replicate(4, as.character(sample(20, 20, TRUE)))
Data <- as.data.frame(Data)


# Now the functions
sort_unique <- function(x){
  y <- lapply(x, \(.x) stringr::str_sort(unique(.x), numeric = TRUE))
  old_class <- class(y)
  class(y) <- c("sortUnique", old_class)
  y
}
print.sortUnique <- function(x, ...){
  n <- max(lengths(x))
  y <- lapply(x, \(.x) c(.x, rep("", n - length(.x))))
  y <- do.call(cbind.data.frame, y)
  names(y) <- names(x)
  print(y)
  invisible(x)
}

# Test the functions above
Data2 <- sort_unique(Data)

class(Data2)
Data2
print(Data2)
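
A side note on why the sorting step asks for numeric = TRUE: the test columns hold numbers stored as character strings, and plain character sorting puts "10" before "2". The base R sketch below shows the effect. It is only an illustration and works only when every string parses as a number, which is narrower than what stringr::str_sort() handles.

```r
x <- c("10", "2", "9", "2", "10")
u <- unique(x)

# Character order: compared character by character, so "10" sorts before "2".
as_text <- sort(u, method = "radix")

# Numeric order: reorder the strings by their numeric value.
as_numbers <- u[order(as.numeric(u))]

as_text
as_numbers
```

If a column is genuinely numeric, converting it once with as.numeric() before sort(unique()) is probably the simpler route.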


Hope this helps,

Rui Barradas

Às 15:55 de 22/12/21, Stephen H. Dawson, DSL escreveu:

I see.

So, we are talking about taking the output into a new dataframe. I was 
hoping to have the output rendered on screen without another 
dataframe, but I can live with this option if it must occur.


Am I correct the desired vertical output must first go to a dataframe?


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 10:47 AM, Duncan Murdoch wrote:

On 22/12/2021 10:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

Both syntax options work to render the correct (unique) output. 
However,

the output is rendered as horizontal. What needs to happen to get the
output to render vertical, please?


The result of those expressions is a vector of the same type as the 
column, so your question is really about how to get a vector to 
print one element per line.


Probably the simplest way is to put the vector in a dataframe (or 
matrix, or tibble, depending on which formatting you prefer).  For 
example,


>   v <- c("red", "green", "blue")
>   data.frame(v)
  v
1   red
2 green
3  blue

If you want a more minimal display, try

> cat(v, sep = "\n")
red
green
blue

or

> cat(format(v, justify = "right"), sep = "\n")
  red
green
 blue

If you want this to happen when you auto-print the object, you can 
give it a class attribute and write a function to print that class, 
e.g.


>  class(v) <- "oneperline"
>
>   print.oneperline <- function(x, ...) cat(format(x, justify = 
"right"), sep = "\n")

>
>   v
  red
green
 blue

Duncan Murdoch




*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:38 AM, Duncan Murdoch wrote:

On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:

On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

sort(unique(Data[1]))
Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing =
decreasing)) :
      undefined columns selected


That's the wrong syntax:  Data[1] is not "column one of Data". Use
Data[[1]] for that, so

 sort(unique(Data[[1]]))


Actually, I'd probably recommend

   sort(unique(Data[, 1]))

instead.  This treats Data as a matrix rather than as a list.
Dataframes are lists that look like matrices, but to me the matrix
aspect is usually more intuitive.

Duncan Murdoch



I think Rui already pointed out the typo in the quoted text below...

Duncan Murdoch



The recommended syntax did not work, as listed above.

What I want is the sort of distinct column output. Again, the 
column

may
be text or numbers. This is a huge analysis effort with data 
coming at

me from many different sources.


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:07 AM, Duncan Murdoch wrote:

On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:

Thanks everyone for the replies.

It is clear one either needs to write a function or put the 
unique

entries into another dataframe.

It seems odd R cannot sort a list of unique column entries 
with ease.


Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Stephen H. Dawson, DSL via R-help

Data <- read.csv("./input/Source.csv", header=T)
v1 <- sort(unique(Data[, 1]))
cat(format(v1, justify = "right"), sep = "\n")

OK, working with the options you presented. This is the combination 
where I gain the most benefit.


However, there is no listing of a column header with the output of this 
syntax.


> cat(format(v1, justify = "right"), sep = "\n")
 2
 3
 4
 5
 6
 7
 8
 9
10
>

NOTE
The output here is correct (unique) based on the entries from the column.

QUESTION
How does one add a text label of something as simple as v1 to the 
vertical output of this syntax, please?


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 11:13 AM, Stephen H. Dawson, DSL via R-help wrote:

OK, now I get what you are suggesting.

Much appreciated.


Kindest Regards,
*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 11:08 AM, Duncan Murdoch wrote:

On 22/12/2021 10:55 a.m., Stephen H. Dawson, DSL wrote:

I see.

So, we are talking about taking the output into a new dataframe. I was hoping
to have the output rendered on screen without another dataframe, but I
can live with this option if it must occur.

Am I correct the desired vertical output must first go to a dataframe?


No, that's just one option.  The other 3 don't use dataframes.

Duncan Murdoch



*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 10:47 AM, Duncan Murdoch wrote:

On 22/12/2021 10:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

Both syntax options work to render the correct (unique) output. 
However,

the output is rendered as horizontal. What needs to happen to get the
output to render vertical, please?


The result of those expressions is a vector of the same type as the
column, so your question is really about how to get a vector to print
one element per line.

Probably the simplest way is to put the vector in a dataframe (or
matrix, or tibble, depending on which formatting you prefer). For
example,


    v <- c("red", "green", "blue")
    data.frame(v)

   v
1   red
2 green
3  blue

If you want a more minimal display, try


cat(v, sep = "\n")

red
green
blue

or


cat(format(v, justify = "right"), sep = "\n")

   red
green
  blue

If you want this to happen when you auto-print the object, you can
give it a class attribute and write a function to print that class, 
e.g.



   class(v) <- "oneperline"

    print.oneperline <- function(x, ...) cat(format(x, justify =

"right"), sep = "\n")


    v

   red
green
  blue

Duncan Murdoch




*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:38 AM, Duncan Murdoch wrote:

On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:

On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

sort(unique(Data[1]))
Error in `[.data.frame`(x, order(x, na.last = na.last, 
decreasing =

decreasing)) :
   undefined columns selected


That's the wrong syntax:  Data[1] is not "column one of Data". Use
Data[[1]] for that, so

  sort(unique(Data[[1]]))


Actually, I'd probably recommend

    sort(unique(Data[, 1]))

instead.  This treats Data as a matrix rather than as a list.
Dataframes are lists that look like matrices, but to me the matrix
aspect is usually more intuitive.

Duncan Murdoch



I think Rui already pointed out the typo in the quoted text 
below...


Duncan Murdoch



The recommended syntax did not work, as listed above.

What I want is the sort of distinct column output. Again, the 
column

may
be text or numbers. This is a huge analysis effort with data
coming at
me from many different sources.


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:07 AM, Duncan Murdoch wrote:
On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help 
wrote:

Thanks everyone for the replies.

It is clear one either needs to write a function or put the unique
entries into another dataframe.

It seems odd R cannot sort a list of unique column entries with ease.
Python and SQL can do it with ease.


I've seen several responses that looked pretty simple. It's hard to
beat sort(unique(x)), though there's a fair bit of confusion about
what you actually want.  Maybe you should post an example of the code
you'd use in Python?

Duncan Murdoch



QUESTION
Is there a simpler means than other than the unique function to
capture
distinct column entries, then sort that list?


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/20/21 5:53 PM, Rui Barradas 

Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Avi Gross via R-help
Stephen,

Understanding a bit better where you are coming from, I come back to how people 
think about things. Languages like R often focus on doing things incrementally. 
I don't mean the language exactly as much as many of the people using the 
language.

So it is perfectly normal to make multiple versions of something as you go 
along and let older versions no longer in use be garbage collected if needed.

Your latest question was how to display your data a certain way. You want it 
written down the screen (or paper) rather than across. Duncan provided you with 
a few of the many ways to do that. Unless you are working with giant amounts of 
data already using up most of your available memory, it is fairly harmless to 
make some temporary copies that get you what you want.

What you may not have known is that some of the systems in R use a concept of 
generic functions. In particular, when you use print() or just put a variable 
name on a line by itself which normally implicitly calls print() for you, it 
does not just magically print but examines what you asked to print and does a 
lookup to see  how to print it. My setup currently has 303 such methods defined 
with names like print.factor and print.Date and print.data.frame that are given 
control to print each kind of object the way you want. So one way to change how 
something is printed is to make it into an object for some class and let the 
system then print it. Of course, you can also design a new class of your own 
and make a print method for it and I suspect someone has done what you want in 
some package.
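
To make the dispatch idea concrete, here is a minimal sketch. The class name "vertical" and its print method are made up for this example and do not come from any package:

```r
# Hypothetical class "vertical": print one element per line.
print.vertical <- function(x, ...) {
  cat(unclass(x), sep = "\n")
  invisible(x)   # print methods conventionally return their input invisibly
}

v <- structure(c("red", "green", "blue"), class = "vertical")

print(v)   # print() is generic, so this call is routed to print.vertical()
v          # auto-printing at top level goes through the same method

# The number of visible print methods depends on which packages are loaded,
# so the count will differ from one session to another.
n_methods <- length(methods("print"))
```

Removing the class again with unclass(v) gives back the plain character vector and the default horizontal display.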

Changing the class of an existing object, even a large object, is fairly 
inexpensive. Other attributes can also control things like "dim" specifying 
dimensions. Say I have a vector containing 1:5 that I want to print vertically.

> vec <- 1:5

  > vec
  [1] 1 2 3 4 5

You can see it normally prints horizontally. Transposing it might sound like a 
good way to go, except that plain R vectors have no row or column orientation. 
Transposing a vector makes a one-row matrix, which prints a bit differently but 
is still horizontal; a second transpose works better:

  > class(t(vec))
  [1] "matrix" "array" 

  > t(vec)
       [,1] [,2] [,3] [,4] [,5]
  [1,]    1    2    3    4    5

  > t(t(vec))
       [,1]
  [1,]    1
  [2,]    2
  [3,]    3
  [4,]    4
  [5,]    5

Of course, you can probably as easily make it a matrix:

  > as.matrix(vec)
       [,1]
  [1,]    1
  [2,]    2
  [3,]    3
  [4,]    4
  [5,]    5

  > matrix(vec, ncol=1)
       [,1]
  [1,]    1
  [2,]    2
  [3,]    3
  [4,]    4
  [5,]    5

The above made a copy, of course.

You can change the original into a matrix by just changing an attribute:

  > dim(vec) <- c(length(vec), 1)

  > vec
       [,1]
  [1,]    1
  [2,]    2
  [3,]    3
  [4,]    4
  [5,]    5

  > attributes(vec)
  $dim
  [1] 5 1
  
  > class(vec)
  [1] "matrix" "array"

BUT you need to be careful as in your earlier experience. Some places that 
accept a vector will not accept a 1-column or 1-row matrix, or a data.frame 
with one column or just one row. Best to be careful about mixing.
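
A quick way to see that caveat, and to undo the change, is sketched below. It only uses base R predicates; the variable name is the same illustrative vec as above:

```r
vec <- 1:5

# Adding a dim attribute turns the vector into a 5 x 1 matrix in place
dim(vec) <- c(length(vec), 1)
was_vector <- is.vector(vec)   # FALSE: the dim attribute disqualifies it
was_matrix <- is.matrix(vec)   # TRUE

# Dropping the attribute gives back a plain vector
dim(vec) <- NULL
is_vector_again <- is.vector(vec)   # TRUE
```

Code that tests its input with is.vector() or relies on length-one dimensions will therefore behave differently before and after the dim assignment.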

So look again at what Duncan sent; some of his options are quite nice. You can 
specifically use the cat() command instead of the default print, and it has 
added functionality. Various packages exist, including some that do various 
kinds of pretty printing.

He left out one of the simplest ones, which is simply to write your own print 
routine. Here you define a trivial one-line function that loops over the items 
and calls print() once per item to make your output vertical:

vertprint <- function(horiz) for (item in horiz) print(item)

vertprint(vec)

  [1] 1
  [1] 2
  [1] 3
  [1] 4
  [1] 5

Obviously if you are printing huge amounts of data, this is not necessarily any 
more efficient. But it does not necessarily make many copies of your data if 
that bothers you.

May I end with a suggestion. It can be fun to start a discussion in a place 
like this but it can also be a waste of time for many people, especially those 
who provide longer answers and do some experimenting to illustrate. Often a 
simple search like the following can rapidly get you an answer before feeling 
the need to ask here. I did a simple search just now for what I assumed was a 
very frequent question:

"R how to print data vertically"

I looked at a few of the answers and noted a few other suggestions with one 
similar but different:

cat(paste(x),sep="\n")

And of course various packages that implemented something like print.vertical().

Your earlier statement suggests you may be interested in what is the canonical 
or best way and by now, you may note there are very often MANY ways and some 
programmers prefer one or another. And, I note, after enough questions of a 
fairly basic or even naïve nature, some responders in these groups stop 
responding for some reason.

-Original Message-
From: R-help  On Behalf Of Stephen H. Dawson, DSL 
via R-help
Sent: Wednesday, December 22, 2021 

[R] for loop question in R

2021-12-22 Thread Kai Yang via R-help
Hello R team,

I want to use a for loop to generate multiple plots with 3 parameters (y is for 
the y axis, c is for color and f is for the file name in the output). I created 
a data frame to save the information and use the information in the for loop. I 
use y[i], c[i] and f[i] in the loop, but it doesn't seem to work. Can anyone 
correct my code to make it work?

Thanks,
Kai

library(ggplot2)
library(tidyverse)
y <- c("hwy","cty")
c <- c("cyl","class")
f <- c("hwy_cyl","cty_class")
mac <- data.frame(y,c,f)
for (i in nrow(mac)){
  mpg %>%
    filter(hwy <35) %>%
    ggplot(aes(x = displ, y = y[i], color = c[i])) +
    geom_point()
  ggsave("c:/temp/f[i].jpg",width = 9, height = 6, dpi = 1200, units = "in")
}
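
One possible reading of why the loop above misbehaves, offered as a base R sketch rather than a tested fix: for (i in nrow(mac)) visits only the last row, and "c:/temp/f[i].jpg" is a literal string, so f[i] is never substituted. The sketch below leaves the plotting out and only checks the loop index and the file names; the variable col (renamed from c) and the use of tempdir() are choices made here, not part of the original post.

```r
y   <- c("hwy", "cty")
col <- c("cyl", "class")        # renamed from `c` for readability (assumption)
f   <- c("hwy_cyl", "cty_class")
mac <- data.frame(y, col, f)

# nrow(mac) is the single number 2, so this loop runs once, with i = 2
visited_wrong <- integer(0)
for (i in nrow(mac)) visited_wrong <- c(visited_wrong, i)

# seq_len(nrow(mac)) is 1, 2, so every row is visited
visited_right <- integer(0)
files <- character(0)
for (i in seq_len(nrow(mac))) {
  visited_right <- c(visited_right, i)
  # paste0() builds the name from the value of f[i]
  files <- c(files, file.path(tempdir(), paste0(mac$f[i], ".jpg")))
  # Inside aes(), y[i] is just a string constant; with ggplot2 one way to
  # look a column up by name is aes(y = .data[[mac$y[i]]]).
}
```

In the real loop the plot would also need to be passed to ggsave(), or printed first, since ggsave() saves the last plot displayed by default.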



Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Rui Barradas

Hello,

The problem is that the vectors of unique values in each column of the 
original data.frame Data need not be same length. And the output of 
sort(unique(.)) is a list of vectors of different lengths. And lists 
print "horizontally", each vector on its own.


Like Duncan said, one of the ways of getting a vertical display is to 
have the list of sorted, unique values be of a custom class and write a 
print method for that class. Here is an example of this. The function to 
sort outputs an object of a class that sub-classes class "list". And a 
print method takes care of the printing. This method creates a temp 
data.frame, prints that df and invisibly returns its input.


# Create a test data set
set.seed(2021)
Data <- replicate(4, as.character(sample(20, 20, TRUE)))
Data <- as.data.frame(Data)


# Now the functions
sort_unique <- function(x){
  y <- lapply(x, \(.x) stringr::str_sort(unique(.x), numeric = TRUE))
  old_class <- class(y)
  class(y) <- c("sortUnique", old_class)
  y
}
print.sortUnique <- function(x, ...){
  n <- max(lengths(x))
  y <- lapply(x, \(.x) c(.x, rep("", n - length(.x))))
  y <- do.call(cbind.data.frame, y)
  names(y) <- names(x)
  print(y)
  invisible(x)
}

# Test the functions above
Data2 <- sort_unique(Data)

class(Data2)
Data2
print(Data2)


Hope this helps,

Rui Barradas

Às 15:55 de 22/12/21, Stephen H. Dawson, DSL escreveu:

I see.

So, we are talking taking the output into a new dataframe. I was hoping 
to have the output rendered on screen without another dataframe, but I 
can live with this option it if must occur.


Am I correct the desired vertical output must first go to a dataframe?


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 10:47 AM, Duncan Murdoch wrote:

On 22/12/2021 10:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

Both syntax options work to render the correct (unique) output. However,
the output is rendered as horizontal. What needs to happen to get the
output to render vertical, please?


The result of those expressions is a vector of the same type as the 
column, so your question is really about how to get a vector to print 
one element per line.


Probably the simplest way is to put the vector in a dataframe (or 
matrix, or tibble, depending on which formatting you prefer).  For 
example,


>   v <- c("red", "green", "blue")
>   data.frame(v)
  v
1   red
2 green
3  blue

If you want a more minimal display, try

> cat(v, sep = "\n")
red
green
blue

or

> cat(format(v, justify = "right"), sep = "\n")
  red
green
 blue

If you want this to happen when you auto-print the object, you can 
give it a class attribute and write a function to print that class, e.g.


>   class(v) <- "oneperline"
>
>   print.oneperline <- function(x, ...) cat(format(x, justify = "right"), sep = "\n")
>
>   v
  red
green
 blue

Duncan Murdoch




*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:38 AM, Duncan Murdoch wrote:

On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:

On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

sort(unique(Data[1]))
Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing =
decreasing)) :
      undefined columns selected


That's the wrong syntax:  Data[1] is not "column one of Data". Use
Data[[1]] for that, so

 sort(unique(Data[[1]]))


Actually, I'd probably recommend

   sort(unique(Data[, 1]))

instead.  This treats Data as a matrix rather than as a list.
Dataframes are lists that look like matrices, but to me the matrix
aspect is usually more intuitive.

Duncan Murdoch



I think Rui already pointed out the typo in the quoted text below...

Duncan Murdoch



The recommended syntax did not work, as listed above.

What I want is the sort of distinct column output. Again, the column may
be text or numbers. This is a huge analysis effort with data coming at
me from many different sources.


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:07 AM, Duncan Murdoch wrote:

On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:

Thanks everyone for the replies.

It is clear one either needs to write a function or put the unique
entries into another dataframe.

It seems odd R cannot sort a list of unique column entries with ease.
Python and SQL can do it with ease.


I've seen several responses that looked pretty simple. It's hard to
beat sort(unique(x)), though there's a fair bit of confusion about
what you actually want.  Maybe you should post an example of the code
you'd use in Python?

Duncan Murdoch



QUESTION
Is there a simpler means than other than the unique function to
capture
distinct column entries, then sort that list?


*Stephen 

Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Stephen H. Dawson, DSL via R-help

OK, now I get what you are suggesting.

Much appreciated.


Kindest Regards,
*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 11:08 AM, Duncan Murdoch wrote:

On 22/12/2021 10:55 a.m., Stephen H. Dawson, DSL wrote:

I see.

So, we are talking taking the output into a new dataframe. I was hoping
to have the output rendered on screen without another dataframe, but I
can live with this option it if must occur.

Am I correct the desired vertical output must first go to a dataframe?


No, that's just one option.  The other 3 don't use dataframes.

Duncan Murdoch



*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 10:47 AM, Duncan Murdoch wrote:

On 22/12/2021 10:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

Both syntax options work to render the correct (unique) output. However,
the output is rendered as horizontal. What needs to happen to get the
output to render vertical, please?


The result of those expressions is a vector of the same type as the
column, so your question is really about how to get a vector to print
one element per line.

Probably the simplest way is to put the vector in a dataframe (or
matrix, or tibble, depending on which formatting you prefer). For
example,


    v <- c("red", "green", "blue")
    data.frame(v)

   v
1   red
2 green
3  blue

If you want a more minimal display, try


cat(v, sep = "\n")

red
green
blue

or


cat(format(v, justify = "right"), sep = "\n")

   red
green
  blue

If you want this to happen when you auto-print the object, you can
give it a class attribute and write a function to print that class, 
e.g.



    class(v) <- "oneperline"

    print.oneperline <- function(x, ...) cat(format(x, justify = "right"), sep = "\n")


    v

   red
green
  blue

Duncan Murdoch




*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:38 AM, Duncan Murdoch wrote:

On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:

On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

sort(unique(Data[1]))
Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing =
decreasing)) :
   undefined columns selected


That's the wrong syntax:  Data[1] is not "column one of Data". Use
Data[[1]] for that, so

  sort(unique(Data[[1]]))


Actually, I'd probably recommend

    sort(unique(Data[, 1]))

instead.  This treats Data as a matrix rather than as a list.
Dataframes are lists that look like matrices, but to me the matrix
aspect is usually more intuitive.

Duncan Murdoch



I think Rui already pointed out the typo in the quoted text below...

Duncan Murdoch



The recommended syntax did not work, as listed above.

What I want is the sort of distinct column output. Again, the column may
be text or numbers. This is a huge analysis effort with data coming at
me from many different sources.


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:07 AM, Duncan Murdoch wrote:

On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:

Thanks everyone for the replies.

It is clear one either needs to write a function or put the unique
entries into another dataframe.

It seems odd R cannot sort a list of unique column entries with ease.
Python and SQL can do it with ease.


I've seen several responses that looked pretty simple. It's hard to
beat sort(unique(x)), though there's a fair bit of confusion about
what you actually want.  Maybe you should post an example of the code
you'd use in Python?

Duncan Murdoch



QUESTION
Is there a simpler means than other than the unique function to
capture
distinct column entries, then sort that list?


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/20/21 5:53 PM, Rui Barradas wrote:

Hello,

Inline.

Às 21:18 de 20/12/21, Stephen H. Dawson, DSL via R-help 
escreveu:

Thanks.

sort(unique(Data[[1]]))

This syntax provides row numbers, not column values.


This is not right.
The syntax Data[1] extracts a sub-data.frame, the syntax Data[[1]]
extracts the column vector.

As for my previous answer, it was not addressing the question, I
misinterpreted it as being a question on how to sort by numeric
order
when the data is not numeric. Here is a, hopefully, complete
answer.
Still with package stringr.


cols_to_sort <- 1:4

Data2 <- lapply(Data[cols_to_sort], \(x){
       stringr::str_sort(unique(x), numeric = TRUE)
})


Or using Avi's suggestion of writing a function to do all the
work and
simplify the lapply loop later,


unisort2 <- function(vec, ...) 

Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Duncan Murdoch

On 22/12/2021 10:55 a.m., Stephen H. Dawson, DSL wrote:

I see.

So, we are talking taking the output into a new dataframe. I was hoping
to have the output rendered on screen without another dataframe, but I
can live with this option it if must occur.

Am I correct the desired vertical output must first go to a dataframe?


No, that's just one option.  The other 3 don't use dataframes.

Duncan Murdoch



*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 10:47 AM, Duncan Murdoch wrote:

On 22/12/2021 10:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

Both syntax options work to render the correct (unique) output. However,
the output is rendered as horizontal. What needs to happen to get the
output to render vertical, please?


The result of those expressions is a vector of the same type as the
column, so your question is really about how to get a vector to print
one element per line.

Probably the simplest way is to put the vector in a dataframe (or
matrix, or tibble, depending on which formatting you prefer).  For
example,


    v <- c("red", "green", "blue")
    data.frame(v)

   v
1   red
2 green
3  blue

If you want a more minimal display, try


cat(v, sep = "\n")

red
green
blue

or


cat(format(v, justify = "right"), sep = "\n")

   red
green
  blue

If you want this to happen when you auto-print the object, you can
give it a class attribute and write a function to print that class, e.g.


    class(v) <- "oneperline"

    print.oneperline <- function(x, ...) cat(format(x, justify = "right"), sep = "\n")


    v

   red
green
  blue

Duncan Murdoch




*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:38 AM, Duncan Murdoch wrote:

On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:

On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

sort(unique(Data[1]))
Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing =
decreasing)) :
       undefined columns selected


That's the wrong syntax:  Data[1] is not "column one of Data". Use
Data[[1]] for that, so

  sort(unique(Data[[1]]))


Actually, I'd probably recommend

    sort(unique(Data[, 1]))

instead.  This treats Data as a matrix rather than as a list.
Dataframes are lists that look like matrices, but to me the matrix
aspect is usually more intuitive.

Duncan Murdoch



I think Rui already pointed out the typo in the quoted text below...

Duncan Murdoch



The recommended syntax did not work, as listed above.

What I want is the sort of distinct column output. Again, the column
may
be text or numbers. This is a huge analysis effort with data
coming at
me from many different sources.


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:07 AM, Duncan Murdoch wrote:

On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:

Thanks everyone for the replies.

It is clear one either needs to write a function or put the unique
entries into another dataframe.

It seems odd R cannot sort a list of unique column entries with
ease.
Python and SQL can do it with ease.


I've seen several responses that looked pretty simple. It's hard to
beat sort(unique(x)), though there's a fair bit of confusion about
what you actually want.  Maybe you should post an example of the
code
you'd use in Python?

Duncan Murdoch



QUESTION
Is there a simpler means than other than the unique function to
capture
distinct column entries, then sort that list?


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/20/21 5:53 PM, Rui Barradas wrote:

Hello,

Inline.

Às 21:18 de 20/12/21, Stephen H. Dawson, DSL via R-help escreveu:

Thanks.

sort(unique(Data[[1]]))

This syntax provides row numbers, not column values.


This is not right.
The syntax Data[1] extracts a sub-data.frame, the syntax Data[[1]]
extracts the column vector.

As for my previous answer, it was not addressing the question, I
misinterpreted it as being a question on how to sort by numeric
order
when the data is not numeric. Here is a, hopefully, complete
answer.
Still with package stringr.


cols_to_sort <- 1:4

Data2 <- lapply(Data[cols_to_sort], \(x){
       stringr::str_sort(unique(x), numeric = TRUE)
})


Or using Avi's suggestion of writing a function to do all the
work and
simplify the lapply loop later,


unisort2 <- function(vec, ...) stringr::str_sort(unique(vec), ...)
Data2 <- lapply(Data[cols_to_sort], unisort2, numeric = TRUE)


Hope this helps,

Rui Barradas




*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/20/21 

Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Stephen H. Dawson, DSL via R-help

I see.

So, we are talking taking the output into a new dataframe. I was hoping 
to have the output rendered on screen without another dataframe, but I 
can live with this option it if must occur.


Am I correct the desired vertical output must first go to a dataframe?


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/22/21 10:47 AM, Duncan Murdoch wrote:

On 22/12/2021 10:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

Both syntax options work to render the correct (unique) output. However,
the output is rendered as horizontal. What needs to happen to get the
output to render vertical, please?


The result of those expressions is a vector of the same type as the 
column, so your question is really about how to get a vector to print 
one element per line.


Probably the simplest way is to put the vector in a dataframe (or 
matrix, or tibble, depending on which formatting you prefer).  For 
example,


>   v <- c("red", "green", "blue")
>   data.frame(v)
  v
1   red
2 green
3  blue

If you want a more minimal display, try

> cat(v, sep = "\n")
red
green
blue

or

> cat(format(v, justify = "right"), sep = "\n")
  red
green
 blue

If you want this to happen when you auto-print the object, you can 
give it a class attribute and write a function to print that class, e.g.


>   class(v) <- "oneperline"
>
>   print.oneperline <- function(x, ...) cat(format(x, justify = "right"), sep = "\n")
>
>   v
  red
green
 blue

Duncan Murdoch




*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:38 AM, Duncan Murdoch wrote:

On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:

On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

sort(unique(Data[1]))
Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing =
decreasing)) :
      undefined columns selected


That's the wrong syntax:  Data[1] is not "column one of Data". Use
Data[[1]] for that, so

 sort(unique(Data[[1]]))


Actually, I'd probably recommend

   sort(unique(Data[, 1]))

instead.  This treats Data as a matrix rather than as a list.
Dataframes are lists that look like matrices, but to me the matrix
aspect is usually more intuitive.

Duncan Murdoch



I think Rui already pointed out the typo in the quoted text below...

Duncan Murdoch



The recommended syntax did not work, as listed above.

What I want is the sort of distinct column output. Again, the column may
be text or numbers. This is a huge analysis effort with data coming at
me from many different sources.


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:07 AM, Duncan Murdoch wrote:

On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:

Thanks everyone for the replies.

It is clear one either needs to write a function or put the unique
entries into another dataframe.

It seems odd R cannot sort a list of unique column entries with ease.
Python and SQL can do it with ease.


I've seen several responses that looked pretty simple. It's hard to
beat sort(unique(x)), though there's a fair bit of confusion about
what you actually want.  Maybe you should post an example of the code
you'd use in Python?

Duncan Murdoch



QUESTION
Is there a simpler means than other than the unique function to
capture
distinct column entries, then sort that list?


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/20/21 5:53 PM, Rui Barradas wrote:

Hello,

Inline.

Às 21:18 de 20/12/21, Stephen H. Dawson, DSL via R-help escreveu:

Thanks.

sort(unique(Data[[1]]))

This syntax provides row numbers, not column values.


This is not right.
The syntax Data[1] extracts a sub-data.frame, the syntax Data[[1]]
extracts the column vector.

As for my previous answer, it was not addressing the question, I
misinterpreted it as being a question on how to sort by numeric
order
when the data is not numeric. Here is a, hopefully, complete answer.
Still with package stringr.


cols_to_sort <- 1:4

Data2 <- lapply(Data[cols_to_sort], \(x){
      stringr::str_sort(unique(x), numeric = TRUE)
})


Or using Avi's suggestion of writing a function to do all the
work and
simplify the lapply loop later,


unisort2 <- function(vec, ...) stringr::str_sort(unique(vec), ...)
Data2 <- lapply(Data[cols_to_sort], unisort2, numeric = TRUE)


Hope this helps,

Rui Barradas




*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/20/21 11:58 AM, Stephen H. Dawson, DSL via R-help wrote:

Hi,


Running a simple syntax set to review entries in dataframe
columns.
Here is 

Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Stephen H. Dawson, DSL via R-help

Avi,


Thanks for the detailed reply. I am unable to reply with the same 
detail. Please do not take my lack of response depth as demonstrating a 
lack of appreciation.


My intent was to post an open-ended question asking best process, not 
best practice, to amend sort to unique. I posted code showing how I 
arrived at my present status. The function read.csv reads a file in 
table format and creates a data frame from it. No column types are 
defined in the code. The scale of workload was not considered, as it is 
beyond scope at this point in the dialogue. What is present is defining 
what works, then selecting the efficient option for the scale of 
workload to accomplish.


The biggest benefit of an open-ended question is dialogue members asking 
questions to both confirm understandings and explore other 
considerations. Asking a specific question is necessary for an 
open-ended discussion, perhaps two questions. What often destroys an 
open-ended dialogue is placing boundaries on the dialogue.


This dialogue has shown there is no single best process to add sort to 
unique. I am fine with this outcome. It has been time well-spent for me, 
and the dialogue members from what I read in their positions on the 
concept of arranging data to be processed.


My definition of ease is simple: Whatever it takes to do what I need to do.

What is not included in this definition is mastering all aspects of step 
one before I move to step two.


Sort() does care what is fed to it. This has been the case with all 
occurrences of my experiences for both programming and using 
already-built code. Computers have a funny way of doing what they are 
told to do.


I do not want a language that has calculated every possible combination 
of ways to combine functions and already made tens of thousands available.


I look forward to learning more about the over 100 languages you can 
program during my journey to learn more about GNU R.



*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 2:17 PM, Avi Gross via R-help wrote:

Stephen,

Languages have their own philosophies and are often focused initially on doing 
specific things well. Later, they tend to accumulate additional functionality 
both in the base language and extensions.

I am wondering if you have explained your need precisely enough to get the 
answers you want.

SQL and Python have their own ways and both have advantages but also huge 
deficiencies relative to just base R.

But there are rules you live with, and if you choose, say, a data.frame to 
store things in, the columns must all be the same length. The unique values of 
different columns are unlikely to be equal in number, so storing them in a 
data.frame does not work. They can be stored quite a few other ways, such as a 
list of lists.
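
As a small illustration of that length mismatch, here is a sketch on made-up data. A plain list of vectors is used here as one of the possible containers; the column names and values are assumptions:

```r
# Made-up data: column a has 3 distinct values, column b has 2
Data <- data.frame(a = c("x", "y", "x", "z"),
                   b = c("p", "p", "p", "q"))

# A data.frame is a list of columns, so lapply() visits one column at a time
uniq <- lapply(Data, function(column) sort(unique(column)))
n_unique <- lengths(uniq)

# Forcing the ragged result back into a data.frame fails, because the
# components have different lengths
as_df <- try(as.data.frame(uniq), silent = TRUE)
failed <- inherits(as_df, "try-error")
```

The list keeps each column's distinct values at their natural length, which is why the replies in this thread return lists from lapply().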

And what is your definition of ease? I can program in Python and SQL and way 
over a hundred other languages and I know I need to adapt my thinking to the 
flow of the language and not the other way around. Base R was not designed to 
be like either SQL or Python. But it can be extended quite a few ways to do 
just about anything.

What you ran into for example is the fact that some functionality is more 
selective in what it works on. A data.frame with one column is logically the 
same as a matrix with one column and as a vector but in reality, they are not 
the same thing. Yes, they can be converted into each other fairly trivially. 
Sort() seems to care what you feed it. If you did not worry about efficiency, 
you could have a version of sort that accepts a wide variety of inputs, 
converts any it can to some possibly common internal form, then converts the 
output back into the form it was received in, or uses a command-line option to 
specify the output format. It is not hard in R to make such a function as it 
has the primitives needed to examine an arbitrary object and see what 
dimensions it has for some number of types and so on, and has utilities to do 
the conversion.
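
A rough sketch of such a wrapper follows. The name sort_any and its restrictions (one-column data frames, one-row or one-column matrices) are assumptions made for this example; it is not an existing base R function:

```r
# Hypothetical helper: sort the values and hand them back in the container
# they arrived in.
sort_any <- function(x, decreasing = FALSE) {
  # na.last = TRUE keeps NAs, so the result has the same length as the input
  do_sort <- function(v) sort(v, decreasing = decreasing, na.last = TRUE)

  if (is.data.frame(x)) {
    stopifnot(ncol(x) == 1L)      # only the one-column case is handled
    x[[1]] <- do_sort(x[[1]])
    return(x)
  }
  if (is.matrix(x)) {
    stopifnot(1L %in% dim(x))     # one row or one column
    x[] <- do_sort(as.vector(x))  # x[] keeps the dim attribute
    return(x)
  }
  do_sort(x)                      # plain vector
}

from_df     <- sort_any(data.frame(v = c(3, 1, 2)))
from_matrix <- sort_any(matrix(c(3, 1, 2), ncol = 1))
from_vector <- sort_any(c("b", "a"))
```

The cost is the extra inspection and conversion on every call, which is the efficiency trade-off mentioned above.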

If you want a language that has calculated every possible combination of ways to combine functions and 
already made tens of thousands available, good luck. What languages (including Python and R) expect is 
for you to compose such combinations yourself in one of many ways. The annoying discussions here between 
purists and those wanting to use pre-made packages aside, your question can be handled in many of the 
ways we already discussed. They include making your own (often very small) function that implements 
consolidating the many steps into one logical step. It can mean using pipelines like the new 
"|>" operator recently added to base R or the older versions often used in the tidyverse 
packages like "%>%".
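
For comparison, here are the nested, piped and small-function forms side by side. This is a sketch on made-up data; it assumes R 4.1 or later for the "|>" operator, and the helper name unisort is only used for illustration:

```r
Data <- data.frame(id = c("b", "a", "b", "c"))   # made-up data

# Nested calls, read from the inside out
nested <- sort(unique(Data[[1]]))

# Base R pipe, read from left to right
piped <- Data[[1]] |> unique() |> sort()

# A very small function that consolidates both steps into one logical step
unisort <- function(vec, ...) sort(unique(vec), ...)
helper <- unisort(Data[[1]])
```

All three should produce the same sorted vector of distinct values; which one to use is mostly a matter of taste.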

You want to take a data.frame and select a column at a time and ask for it to be made into unique 
values then ordered and shown. So you want a VECTOR and your initial use of the "[" 

Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Duncan Murdoch

On 22/12/2021 10:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

Both syntax options work to render the correct (unique) output. However,
the output is rendered as horizontal. What needs to happen to get the
output to render vertical, please?


The result of those expressions is a vector of the same type as the 
column, so your question is really about how to get a vector to print 
one element per line.


Probably the simplest way is to put the vector in a dataframe (or 
matrix, or tibble, depending on which formatting you prefer).  For example,


>   v <- c("red", "green", "blue")
>   data.frame(v)
  v
1   red
2 green
3  blue

If you want a more minimal display, try

> cat(v, sep = "\n")
red
green
blue

or

> cat(format(v, justify = "right"), sep = "\n")
  red
green
 blue

If you want this to happen when you auto-print the object, you can give 
it a class attribute and write a function to print that class, e.g.


> class(v) <- "oneperline"
> print.oneperline <- function(x, ...)
+   cat(format(x, justify = "right"), sep = "\n")
> v
  red
green
 blue

Duncan Murdoch




*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:38 AM, Duncan Murdoch wrote:

On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:

On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

sort(unique(Data[1]))
Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing =
decreasing)) :
      undefined columns selected


That's the wrong syntax:  Data[1] is not "column one of Data". Use
Data[[1]] for that, so

     sort(unique(Data[[1]]))


Actually, I'd probably recommend

   sort(unique(Data[, 1]))

instead.  This treats Data as a matrix rather than as a list.
Dataframes are lists that look like matrices, but to me the matrix
aspect is usually more intuitive.
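
For illustration, here is a small sketch of the three indexing forms on a made-up data frame (the column names and values are assumptions, not the poster's data):

```r
Data <- data.frame(x = c("b", "a", "b"), y = c(2, 1, 2),
                   stringsAsFactors = FALSE)

class(Data[1])     # "data.frame": a one-column sub-data-frame
class(Data[[1]])   # "character": the column as a vector (list-style)
class(Data[, 1])   # "character": matrix-style indexing drops to a vector

list_style   <- sort(unique(Data[[1]]))
matrix_style <- sort(unique(Data[, 1]))
```

Both list_style and matrix_style hold the same sorted vector. One caveat: with a tibble, Data[, 1] stays a tibble rather than dropping to a vector, so [[ is the safer form there.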

Duncan Murdoch



I think Rui already pointed out the typo in the quoted text below...

Duncan Murdoch



The recommended syntax did not work, as listed above.

What I want is the sort of distinct column output. Again, the column may
be text or numbers. This is a huge analysis effort with data coming at
me from many different sources.


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:07 AM, Duncan Murdoch wrote:

On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:

Thanks everyone for the replies.

It is clear one either needs to write a function or put the unique
entries into another dataframe.

It seems odd R cannot sort a list of unique column entries with ease.
Python and SQL can do it with ease.


I've seen several responses that looked pretty simple.  It's hard to
beat sort(unique(x)), though there's a fair bit of confusion about
what you actually want.  Maybe you should post an example of the code
you'd use in Python?

Duncan Murdoch



QUESTION
Is there a simpler means, other than the unique function, to capture
distinct column entries and then sort that list?


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/20/21 5:53 PM, Rui Barradas wrote:

Hello,

Inline.

Às 21:18 de 20/12/21, Stephen H. Dawson, DSL via R-help escreveu:

Thanks.

sort(unique(Data[[1]]))

This syntax provides row numbers, not column values.


This is not right.
The syntax Data[1] extracts a sub-data.frame, the syntax Data[[1]]
extracts the column vector.

As for my previous answer, it was not addressing the question, I
misinterpreted it as being a question on how to sort by numeric order
when the data is not numeric. Here is a, hopefully, complete answer.
Still with package stringr.


cols_to_sort <- 1:4

Data2 <- lapply(Data[cols_to_sort], \(x){
      stringr::str_sort(unique(x), numeric = TRUE)
})


Or using Avi's suggestion of writing a function to do all the work and
simplify the lapply loop later,


unisort2 <- function(vec, ...) stringr::str_sort(unique(vec), ...)
Data2 <- lapply(Data[cols_to_sort], unisort2, numeric = TRUE)
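
If stringr is not available, a base R sketch of the same idea is possible under one loud assumption: every entry in the column parses as a number. The vector x below is made-up data.

```r
x <- c("10", "2", "1", "2")   # numbers stored as text (illustrative)

u <- unique(x)

# Plain sort() on character data compares digit by digit.
character_order <- sort(u)              # "1" "10" "2"

# Assumption: all entries convert cleanly with as.numeric().
# Order the text values by their numeric value instead.
numeric_order <- u[order(as.numeric(u))]   # "1" "2" "10"
```

For columns that mix text and digits, as.numeric() would give NA with a warning, so this sketch does not replace str_sort(numeric = TRUE) in that case.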


Hope this helps,

Rui Barradas




*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/20/21 11:58 AM, Stephen H. Dawson, DSL via R-help wrote:

Hi,


Running a simple syntax set to review entries in dataframe columns.
Here is the working code.

Data <- read.csv("./input/Source.csv", header=T)
describe(Data)
summary(Data)
unique(Data[1])
unique(Data[2])
unique(Data[3])
unique(Data[4])

I would like to sort the unique entries. The data in the various
columns are not all defined as numbers; some are text. I realize 1 and
10 will not sort properly, as the column is not defined as a number,
but want to see what I have in the columns viewed as sorted.

QUESTION
What is the best process to sort 

Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Stephen H. Dawson, DSL via R-help

Bert,


Thanks for the reply.

I did not think to put values back into the same column. This action 
would not make sense to me, as it would destroy data integrity. I guess 
adding to a new column in the same container, in this case a dataframe, 
is possible but again not probable with me.


Either way, thanks for confirming all that comes out count-wise in a 
dataframe is what must go back into a dataframe count-wise.


It is nice to have folks on a mailing list that help to flesh out what 
one thinks is happening and will happen with syntax versus what is 
actually happening and will happen with syntax.



*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 3:38 PM, Bert Gunter wrote:

Stephen:
You seem confused about data frames. sort(unique(...)) has no problem
sorting individual columns in a data frame (mod the issues about
mixing numerics and non-numerics that have already been discussed).
But the problem is that the results can *not* be put back in a data
frame because, **by definition** all columns in a data frame **must**
have the same number of values. unique() will change the number of
values in a column if done column by column, e.g. via lapply() or
looping over columns. Consequently, if you do this by lapply(), you'll
get a list back, not a data frame. e.g.


dat <- data.frame(a = rep(3:1,2), b = c(5:1,5))
dat

  a b
1 3 5
2 2 4
3 1 3
4 3 2
5 2 1
6 1 5

## via lapply
dat <- lapply(dat, \(x)sort(unique(x)))
dat  ## a list.

$a
[1] 1 2 3

$b
[1] 1 2 3 4 5


## Trying to do this with an explicit loop results in an error
dat <- data.frame(a = rep(1:3,2), b = c(1:5,5))
for(nm in names(dat))dat[[nm]] <- sort(unique(dat[[nm]])) ## error

Error in `[[<-.data.frame`(`*tmp*`, nm, value = c(1, 2, 3, 4, 5)) :
   replacement has 5 rows, data has 6

OTOH, unique() has a data.frame method which will give unique *rows*
(thinking of a data frame as a matrix-like object with a "dim"
attribute):


dat <- data.frame(a = c(1,2,1), b = c('a','b','a'))
dat

  a b
1 1 a
2 2 b
3 1 a

unique(dat)

  a b
1 1 a
2 2 b

There is no sort() method for data frames as this has no obvious
single interpretation of sorting by whole rows. However, see ?sort for
an example using ?order to carry out one possible interpretation of
sorting by rows.
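
One such interpretation, sketched on made-up data (the column names and values are assumptions): take the unique rows, then use order() on the columns to sort rows by a and break ties with b.

```r
dat <- data.frame(a = c(2, 1, 2, 1), b = c("y", "x", "y", "z"),
                  stringsAsFactors = FALSE)

u <- unique(dat)                      # unique *rows*: three remain
sorted <- u[order(u$a, u$b), ]        # sort rows by a, then by b
sorted
```

The result is still a data frame, because whole rows are kept together rather than sorting each column separately.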

Bert


On Tue, Dec 21, 2021 at 7:16 AM Stephen H. Dawson, DSL via R-help wrote:

Thanks everyone for the replies.

It is clear one either needs to write a function or put the unique
entries into another dataframe.

It seems odd R cannot sort a list of unique column entries with ease.
Python and SQL can do it with ease.

QUESTION
Is there a simpler means, other than the unique function, to capture
distinct column entries and then sort that list?


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/20/21 5:53 PM, Rui Barradas wrote:

Hello,

Inline.

Às 21:18 de 20/12/21, Stephen H. Dawson, DSL via R-help escreveu:

Thanks.

sort(unique(Data[[1]]))

This syntax provides row numbers, not column values.

This is not right.
The syntax Data[1] extracts a sub-data.frame, the syntax Data[[1]]
extracts the column vector.

As for my previous answer, it was not addressing the question, I
misinterpreted it as being a question on how to sort by numeric order
when the data is not numeric. Here is a, hopefully, complete answer.
Still with package stringr.


cols_to_sort <- 1:4

Data2 <- lapply(Data[cols_to_sort], \(x){
   stringr::str_sort(unique(x), numeric = TRUE)
})


Or using Avi's suggestion of writing a function to do all the work and
simplify the lapply loop later,


unisort2 <- function(vec, ...) stringr::str_sort(unique(vec), ...)
Data2 <- lapply(Data[cols_to_sort], unisort2, numeric = TRUE)


Hope this helps,

Rui Barradas



*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/20/21 11:58 AM, Stephen H. Dawson, DSL via R-help wrote:

Hi,


Running a simple syntax set to review entries in dataframe columns.
Here is the working code.

Data <- read.csv("./input/Source.csv", header=T)
describe(Data)
summary(Data)
unique(Data[1])
unique(Data[2])
unique(Data[3])
unique(Data[4])

I would like to sort the unique entries. The data in the various
columns are not all defined as numbers; some are text. I realize 1 and
10 will not sort properly, as the column is not defined as a number,
but want to see what I have in the columns viewed as sorted.

QUESTION
What is the best process to sort unique output, please?


Thanks.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Adding SORT to UNIQUE

2021-12-22 Thread Stephen H. Dawson, DSL via R-help

Thanks for the reply.

Both syntax options work to render the correct (unique) output. However, 
the output is rendered as horizontal. What needs to happen to get the 
output to render vertical, please?



*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:38 AM, Duncan Murdoch wrote:

On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:

On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:

Thanks for the reply.

sort(unique(Data[1]))
Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing =
decreasing)) :
     undefined columns selected


That's the wrong syntax:  Data[1] is not "column one of Data". Use
Data[[1]] for that, so

    sort(unique(Data[[1]]))


Actually, I'd probably recommend

  sort(unique(Data[, 1]))

instead.  This treats Data as a matrix rather than as a list. 
Dataframes are lists that look like matrices, but to me the matrix 
aspect is usually more intuitive.


Duncan Murdoch



I think Rui already pointed out the typo in the quoted text below...

Duncan Murdoch



The recommended syntax did not work, as listed above.

What I want is the sort of distinct column output. Again, the column may
be text or numbers. This is a huge analysis effort with data coming at
me from many different sources.


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/21/21 11:07 AM, Duncan Murdoch wrote:

On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:

Thanks everyone for the replies.

It is clear one either needs to write a function or put the unique
entries into another dataframe.

It seems odd R cannot sort a list of unique column entries with ease.
Python and SQL can do it with ease.


I've seen several responses that looked pretty simple.  It's hard to
beat sort(unique(x)), though there's a fair bit of confusion about
what you actually want.  Maybe you should post an example of the code
you'd use in Python?

Duncan Murdoch



QUESTION
Is there a simpler means, other than the unique function, to capture
distinct column entries and then sort that list?


*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/20/21 5:53 PM, Rui Barradas wrote:

Hello,

Inline.

Às 21:18 de 20/12/21, Stephen H. Dawson, DSL via R-help escreveu:

Thanks.

sort(unique(Data[[1]]))

This syntax provides row numbers, not column values.


This is not right.
The syntax Data[1] extracts a sub-data.frame, the syntax Data[[1]]
extracts the column vector.

As for my previous answer, it was not addressing the question, I
misinterpreted it as being a question on how to sort by numeric order
when the data is not numeric. Here is a, hopefully, complete answer.
Still with package stringr.


cols_to_sort <- 1:4

Data2 <- lapply(Data[cols_to_sort], \(x){
     stringr::str_sort(unique(x), numeric = TRUE)
})


Or using Avi's suggestion of writing a function to do all the work and
simplify the lapply loop later,


unisort2 <- function(vec, ...) stringr::str_sort(unique(vec), ...)
Data2 <- lapply(Data[cols_to_sort], unisort2, numeric = TRUE)


Hope this helps,

Rui Barradas




*Stephen Dawson, DSL*
/Executive Strategy Consultant/
Business & Technology
+1 (865) 804-3454
http://www.shdawson.com 


On 12/20/21 11:58 AM, Stephen H. Dawson, DSL via R-help wrote:

Hi,


Running a simple syntax set to review entries in dataframe columns.
Here is the working code.

Data <- read.csv("./input/Source.csv", header=T)
describe(Data)
summary(Data)
unique(Data[1])
unique(Data[2])
unique(Data[3])
unique(Data[4])

I would like to sort the unique entries. The data in the various
columns are not all defined as numbers; some are text. I realize 1 and
10 will not sort properly, as the column is not defined as a number,
but want to see what I have in the columns viewed as sorted.

QUESTION
What is the best process to sort unique output, please?


Thanks.



[R] visualizing CTM topic models in R...

2021-12-22 Thread akshay kulkarni
Dear members,
I am using the topicmodels package in R to segregate some news articles 
related to stocks.

I know that I can visualize LDA models with the LDAvis package. But I have not 
found (on the internet) any package or base function to visualize a 
CTM model. Any ideas?

Thanking You,
Yours sincerely,
AKSHAY M KULKARNI
