Re: [R] Date

2021-11-04 Thread Jeff Newmiller
Then you are looking at a different file... check your filenames. You have 
imported the column as character, and R has not yet recognized that it is 
supposed to be a date, so it can only show what it found.

You will almost certainly find your error if you make a reproducible example.

On November 4, 2021 5:30:22 PM PDT, Val  wrote:
>Jeff,
>
>The date from my data file looks as follows in the Linux environment,
>My_date
>2019-09-16
>2021-02-21
>2021-02-22
>2017-10-11
>2017-10-10
>2018-11-11
>2017-10-27
>2017-10-30
>2019-05-20
>
>On Thu, Nov 4, 2021 at 5:00 PM Jeff Newmiller  wrote:
>>
>> You are claiming behavior that is not something R does, but is something 
>> Excel does constantly.
>>
>> Compare what your data file looks like using a text editor with what R has 
>> imported. Absolutely do not use a spreadsheet program to do this.
>>
>> On November 4, 2021 2:43:25 PM PDT, Val  wrote:
>> >Hi All,
>> >
>> >I am  reading a csv file  and one of the columns is named as  "mydate"
>> > with this form, 2019-09-16.
>> >
>> >I am reading this file as
>> >
>> >dat=read.csv("myfile.csv")
>> > the structure of the data looks like as follow
>> >
>> >str(dat)
>> >mydate : chr  "09/16/2019" "02/21/2021" "02/22/2021" "10/11/2017" ...
>> >
>> >Please note the format has changed from yyyy-mm-dd to mm/dd/yyyy
>> >When I tried to change this   as a Date using
>> >
>> >as.Date(as.Date(mydate, format="%m/%d/%Y" )
>> >I am getting this error message
>> >Error in charToDate(x) :
>> >  character string is not in a standard unambiguous format
>> >
>> >My question is,
>> >1. how can I read the file as it is (i.e., without changing the date 
>> >format) ?
>> >2. why does R change the date format?
>> >
>> >Thank you,
>> >
>> >__
>> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >https://stat.ethz.ch/mailman/listinfo/r-help
>> >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> >and provide commented, minimal, self-contained, reproducible code.
>>
>> --
>> Sent from my phone. Please excuse my brevity.

-- 
Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Date

2021-11-04 Thread Val
Jeff,

The date from my data file looks as follows in the Linux environment,
My_date
2019-09-16
2021-02-21
2021-02-22
2017-10-11
2017-10-10
2018-11-11
2017-10-27
2017-10-30
2019-05-20

On Thu, Nov 4, 2021 at 5:00 PM Jeff Newmiller  wrote:
>
> You are claiming behavior that is not something R does, but is something 
> Excel does constantly.
>
> Compare what your data file looks like using a text editor with what R has 
> imported. Absolutely do not use a spreadsheet program to do this.
>
> On November 4, 2021 2:43:25 PM PDT, Val  wrote:
> >Hi All,
> >
> >I am  reading a csv file  and one of the columns is named as  "mydate"
> > with this form, 2019-09-16.
> >
> >I am reading this file as
> >
> >dat=read.csv("myfile.csv")
> > the structure of the data looks like as follow
> >
> >str(dat)
> >mydate : chr  "09/16/2019" "02/21/2021" "02/22/2021" "10/11/2017" ...
> >
> >Please note the format has changed from yyyy-mm-dd to mm/dd/yyyy
> >When I tried to change this   as a Date using
> >
> >as.Date(as.Date(mydate, format="%m/%d/%Y" )
> >I am getting this error message
> >Error in charToDate(x) :
> >  character string is not in a standard unambiguous format
> >
> >My question is,
> >1. how can I read the file as it is (i.e., without changing the date format) 
> >?
> >2. why does R change the date format?
> >
> >Thank you,
> >
> >__
> >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
>
> --
> Sent from my phone. Please excuse my brevity.



Re: [R] Date

2021-11-04 Thread Spencer Graves
	  My speculation is that Microsoft Excel recognized that as a date and 
saved it in the "mm/dd/yyyy" format you saw when reading it into R with 
dat=read.csv("myfile.csv").



	  "str" told you the format.  You can convert that from character to 
Date using as.Date(dat$mydate, '%m/%d/%Y'), as documented in 
help('as.Date').
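
A minimal sketch of that conversion, using the four values shown in the str() output above. The vector is typed in by hand here as an assumption; in the real session it would be dat$mydate.

```r
# Values copied from the str() output in the original post.
mydate <- c("09/16/2019", "02/21/2021", "02/22/2021", "10/11/2017")

# Without a format, as.Date() only tries ISO-like layouts and gives up.
no_format <- tryCatch(as.Date(mydate),
                      error = function(e) conditionMessage(e))
print(no_format)

# With an explicit format the same strings convert cleanly.
with_format <- as.Date(mydate, format = "%m/%d/%Y")
print(with_format)
```

Note the single as.Date() call with balanced parentheses; the call in the original post has two opening parentheses but only one closing one.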



NOTE: The error message, "character string is not in a standard 
unambiguous format" is almost appropriate:  In this case, it's clear 
that "09/16/2019" refers to month 09, day 16, and year 2019.  However, 
if it were "09/06/2019", we would not know if it were September 6 or 9 
June of 2019.  If it were  "09/06/08", we would have the added 
possibility with the year first, followed by month and day:  June 8, 
2009.  This ambiguity is resolved most forcefully by ISO 8601.
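
To illustrate the ambiguity, here is a small sketch that parses the same string "09/06/08" under the three readings mentioned above. The variable names are only illustrative.

```r
ambiguous <- "09/06/08"

# Month/day/year reading: September 6, 2008.
as_mdy <- as.Date(ambiguous, format = "%m/%d/%y")

# Day/month/year reading: 9 June 2008.
as_dmy <- as.Date(ambiguous, format = "%d/%m/%y")

# Year/month/day reading: June 8, 2009.
as_ymd <- as.Date(ambiguous, format = "%y/%m/%d")

# format() on a Date prints the unambiguous ISO 8601 form.
print(format(c(as_mdy, as_dmy, as_ymd)))
```

One string, three different dates: that is why as.Date() refuses to guess and why an explicit format (or ISO 8601 input) is needed.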



  Hope this helps.
  Spencer Graves


On 11/4/21 5:30 PM, PIKAL Petr wrote:

Hi

Not sure why the date format was changed, but if I am correct, R does not read 
dates as dates but as character vectors. You need to convert such columns to 
dates with as.Date(). The error probably comes from using two as.Date() calls.

Cheers
Petr
-Original Message-
From: R-help  On Behalf Of Val
Sent: Thursday, November 4, 2021 10:43 PM
To: r-help@R-project.org (r-help@r-project.org) 
Subject: [R] Date

Hi All,

I am  reading a csv file  and one of the columns is named as  "mydate"
  with this form, 2019-09-16.

I am reading this file as

dat=read.csv("myfile.csv")
  the structure of the data looks like as follow

str(dat)
mydate : chr  "09/16/2019" "02/21/2021" "02/22/2021" "10/11/2017" ...

Please note the format has changed from yyyy-mm-dd to mm/dd/yyyy
When I tried to change this   as a Date using

as.Date(as.Date(mydate, format="%m/%d/%Y" )
I am getting this error message
 Error in charToDate(x) :
   character string is not in a standard unambiguous format

My question is,
1. how can I read the file as it is (i.e., without changing the date format) ?
2. why does R change the date format?

Thank you,



Re: [R] Date

2021-11-04 Thread PIKAL Petr
Hi

Not sure why the date format was changed, but if I am correct, R does not read 
dates as dates but as character vectors. You need to convert such columns to 
dates with as.Date(). The error probably comes from using two as.Date() calls.

Cheers
Petr
-Original Message-
From: R-help  On Behalf Of Val
Sent: Thursday, November 4, 2021 10:43 PM
To: r-help@R-project.org (r-help@r-project.org) 
Subject: [R] Date

Hi All,

I am  reading a csv file  and one of the columns is named as  "mydate"
 with this form, 2019-09-16.

I am reading this file as

dat=read.csv("myfile.csv")
 the structure of the data looks like as follow

str(dat)
mydate : chr  "09/16/2019" "02/21/2021" "02/22/2021" "10/11/2017" ...

Please note the format has changed from yyyy-mm-dd to mm/dd/yyyy
When I tried to change this   as a Date using

as.Date(as.Date(mydate, format="%m/%d/%Y" )
I am getting this error message
Error in charToDate(x) :
  character string is not in a standard unambiguous format

My question is,
1. how can I read the file as it is (i.e., without changing the date format) ?
2. why does R change the date format?

Thank you,



Re: [R] Date

2021-11-04 Thread Jim Lemon
Hi Val,
Try this:

dat=read.csv("myfile.csv",stringsAsFactors=FALSE)

However, the apparently silent conversion of format is a mystery to
me. The only time I have struck something like this was when exporting
dates from Excel some years ago, and there was a silent conversion to
mm/dd/yyyy format if the dates were in dd/mm/yyyy format. Could you
post some sample data?

Jim


On Fri, Nov 5, 2021 at 8:44 AM Val  wrote:
>
> Hi All,
>
> I am  reading a csv file  and one of the columns is named as  "mydate"
>  with this form, 2019-09-16.
>
> I am reading this file as
>
> dat=read.csv("myfile.csv")
>  the structure of the data looks like as follow
>
> str(dat)
> mydate : chr  "09/16/2019" "02/21/2021" "02/22/2021" "10/11/2017" ...
>
> Please note the format has changed from yyyy-mm-dd to mm/dd/yyyy
> When I tried to change this   as a Date using
>
> as.Date(as.Date(mydate, format="%m/%d/%Y" )
> I am getting this error message
> Error in charToDate(x) :
>   character string is not in a standard unambiguous format
>
> My question is,
> 1. how can I read the file as it is (i.e., without changing the date format) ?
> 2. why does R change the date format?
>
> Thank you,
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



Re: [R] Date

2021-11-04 Thread Jeff Newmiller
You are claiming behavior that is not something R does, but is something Excel 
does constantly.

Compare what your data file looks like using a text editor with what R has 
imported. Absolutely do not use a spreadsheet program to do this.
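
That comparison can be done entirely inside R. The sketch below writes a small stand-in file to a temporary path (a hypothetical substitute for myfile.csv) and then shows the raw text next to what read.csv() imports.

```r
# Hypothetical stand-in for myfile.csv, written to a temporary location.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("mydate", "2019-09-16", "2021-02-21", "2021-02-22"), csv_path)

# What is literally stored on disk, with no interpretation at all.
raw_lines <- readLines(csv_path, n = 5)
print(raw_lines)

# What R imports: the dates arrive as text, unchanged.
dat <- read.csv(csv_path)
str(dat)

# Optionally ask read.csv() to convert the column while reading.
dat2 <- read.csv(csv_path, colClasses = c(mydate = "Date"))
str(dat2)
```

If readLines() already shows slashes, the file on disk was rewritten by some other program before R ever saw it.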

On November 4, 2021 2:43:25 PM PDT, Val  wrote:
>Hi All,
>
>I am  reading a csv file  and one of the columns is named as  "mydate"
> with this form, 2019-09-16.
>
>I am reading this file as
>
>dat=read.csv("myfile.csv")
> the structure of the data looks like as follow
>
>str(dat)
>mydate : chr  "09/16/2019" "02/21/2021" "02/22/2021" "10/11/2017" ...
>
>Please note the format has changed from yyyy-mm-dd to mm/dd/yyyy
>When I tried to change this   as a Date using
>
>as.Date(as.Date(mydate, format="%m/%d/%Y" )
>I am getting this error message
>Error in charToDate(x) :
>  character string is not in a standard unambiguous format
>
>My question is,
>1. how can I read the file as it is (i.e., without changing the date format) ?
>2. why does R change the date format?
>
>Thank you,
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

-- 
Sent from my phone. Please excuse my brevity.



[R] Date

2021-11-04 Thread Val
Hi All,

I am  reading a csv file  and one of the columns is named as  "mydate"
 with this form, 2019-09-16.

I am reading this file as

dat=read.csv("myfile.csv")
 the structure of the data looks like as follow

str(dat)
mydate : chr  "09/16/2019" "02/21/2021" "02/22/2021" "10/11/2017" ...

Please note the format has changed from yyyy-mm-dd to mm/dd/yyyy
When I tried to change this   as a Date using

as.Date(as.Date(mydate, format="%m/%d/%Y" )
I am getting this error message
Error in charToDate(x) :
  character string is not in a standard unambiguous format

My question is,
1. how can I read the file as it is (i.e., without changing the date format) ?
2. why does R change the date format?

Thank you,



[R] Thanks for all the patient help

2021-11-04 Thread Rich Shepard

I want to thank all of you for your help the past few days. I now have all
data sets imported, datetime columns added, and distribution stats
calculated for each. No errors.

My searches on the web for what to do when problems() produces no results,
and the few comments on my stackexchange post, were not helpful. So I fell
back on how I debugged FORTRAN code in the 1970s: cut the data file in half
and run that. If the issue still exists, cut that half in half again. Rinse
and repeat until the problem line is found.

It turned out that one 416K line data file had a few double commas between
columns about line 1530 in the file. Finding those double commas showed me
the problem and the emacs search-and-replace changed the two commas to a
single comma.
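
For anyone hitting the same thing, here is a hedged alternative to bisecting by hand, sketched in base R on a few made-up lines (not the real data file). It flags lines that contain a double comma or that have a different number of fields than the header.

```r
# Made-up lines standing in for the CSV; the third data line has a stray comma.
csv_lines <- c("site_nbr,year,ft",
               "14171600,2009,3.21",
               "14171600,2009,3.10",
               "14171600,,2009,2.98",
               "14171600,2009,2.75")

# Lines containing two adjacent commas.
double_comma <- grep(",,", csv_lines, fixed = TRUE)
print(double_comma)

# Lines whose field count differs from the header's.
n_fields <- count.fields(textConnection(csv_lines), sep = ",")
odd_lines <- which(n_fields != n_fields[1])
print(odd_lines)

# Collapse the doubled commas, as the emacs search-and-replace did.
repaired <- gsub(",,", ",", csv_lines, fixed = TRUE)
```

A double comma can also be a legitimately empty field, so the flagged lines still need a look before anything is replaced.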

Now all's well. It took another mug of coffee and a lunch break to work my
way down to the beginning of the large file to the errors, but it worked
just as it did before FORTRAN IV had a debugger. :-)

Stay well, all!

Rich



Re: [R] names.data.frame?

2021-11-04 Thread Duncan Murdoch

On 04/11/2021 12:36 p.m., Bert Gunter wrote:

" Running `methods(names)` lists quite a few methods, ..."

Depending on what packages you have loaded of course.


Yes, just one is in a base package.

Duncan Murdoch



Re: [R] mean() produces NA on double column [FIXED]

2021-11-04 Thread Rich Shepard

On Thu, 4 Nov 2021, Rui Barradas wrote:


Maybe
which(is.na(pdx_stage$ft))
Have you tried na.rm = TRUE?
mean(pdx_stage$ft, na.rm = TRUE)


Rui,

I just scrolled through the data file.

Yes, there are several NAs when the equipment was down and I hadn't put
na.rm = TRUE in the read_csv() import command.

I hadn't caught them before.

Thanks,

Rich



Re: [R] mean() produces NA on double column

2021-11-04 Thread Rui Barradas

Hello,

Maybe


which(is.na(pdx_stage$ft))


Have you tried na.rm = TRUE?


mean(pdx_stage$ft, na.rm = TRUE)
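
A small self-contained sketch of both suggestions, using a made-up vector in place of pdx_stage$ft (the values are assumptions, not the real data):

```r
# Made-up stand-in for pdx_stage$ft with one missing reading.
ft <- c(3.21, 3.12, NA, 2.65)

# A single NA is enough to make the plain mean NA.
plain_mean <- mean(ft)
print(plain_mean)

# Positions of the missing values.
na_rows <- which(is.na(ft))
print(na_rows)

# na.rm = TRUE is an argument of mean(); it drops the NAs before averaging.
clean_mean <- mean(ft, na.rm = TRUE)
print(clean_mean)
```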


Hope this helps,

Rui Barradas

At 18:34 on 04/11/21, Rich Shepard wrote:

I'm not seeing what's different about this tibble so that mean() returns NA
on a column of doubles:

head(pdx_stage)

# A tibble: 6 × 8
  site_nbr  year   mon   day    hr   min tz       ft
1 14211720  2007    10     1     1     0 PDT    3.21
2 14211720  2007    10     1     1    30 PDT    3.12
3 14211720  2007    10     1     2     0 PDT    2.89
4 14211720  2007    10     1     2    30 PDT    2.65
5 14211720  2007    10     1     3     0 PDT    2.38
6 14211720  2007    10     1     3    30 PDT    2.14

mean(pdx_stage$ft)

[1] NA

Other tibbles have doubles in the value column which mean() finds. For
example:

head(pdx_depth_sens)

# A tibble: 6 × 8
  site_nbr  year   mon   day    hr   min tz       ft
1 14211720  2009     1    22     0     0 PST    5.68
2 14211720  2009     1    22     0    30 PST    5.66
3 14211720  2009     1    22     1     0 PST    5.69
4 14211720  2009     1    22     1    30 PST    5.75
5 14211720  2009     1    22     2     0 PST    5.85
6 14211720  2009     1    22     2    30 PST    5.98

mean(pdx_depth_sens$ft)

[1] 8.196686

How do I isolate the source of this issue?

Rich



[R] mean() produces NA on double column

2021-11-04 Thread Rich Shepard

I'm not seeing what's different about this tibble so that mean() returns NA
on a column of doubles:

head(pdx_stage)

# A tibble: 6 × 8
  site_nbr  year   mon   day    hr   min tz       ft
1 14211720  2007    10     1     1     0 PDT    3.21
2 14211720  2007    10     1     1    30 PDT    3.12
3 14211720  2007    10     1     2     0 PDT    2.89
4 14211720  2007    10     1     2    30 PDT    2.65
5 14211720  2007    10     1     3     0 PDT    2.38
6 14211720  2007    10     1     3    30 PDT    2.14

mean(pdx_stage$ft)

[1] NA

Other tibbles have doubles in the value column which mean() finds. For
example:

head(pdx_depth_sens)

# A tibble: 6 × 8
  site_nbr  year   mon   day    hr   min tz       ft
1 14211720  2009     1    22     0     0 PST    5.68
2 14211720  2009     1    22     0    30 PST    5.66
3 14211720  2009     1    22     1     0 PST    5.69
4 14211720  2009     1    22     1    30 PST    5.75
5 14211720  2009     1    22     2     0 PST    5.85
6 14211720  2009     1    22     2    30 PST    5.98

mean(pdx_depth_sens$ft)

[1] 8.196686

How do I isolate the source of this issue?

Rich



Re: [R-es] Sorting a data frame by variables

2021-11-04 Thread Carlos J. Gil Bellosta
Hi, how are you?

If you want to use dplyr, which seems to be the case, you can find some
examples very similar to yours here. Specifically, although you grouped
correctly, you did not say what you want to do with those groups (the
summary part).
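
A minimal sketch of what that missing step could look like, assuming the dplyr package is installed. RBD and Curso come from the original question; the Alumno and Promedio columns, the data values, and the puesto column name are hypothetical.

```r
library(dplyr)

# Hypothetical data: school code, grade level, student id, average mark.
Df <- data.frame(
  RBD      = c(1, 1, 1, 2, 2, 2),
  Curso    = c("A", "A", "A", "A", "A", "A"),
  Alumno   = c("a1", "a2", "a3", "b1", "b2", "b3"),
  Promedio = c(6.5, 5.0, 5.8, 4.2, 6.9, 6.0)
)

# Group by school and grade level, then rank within each group so that
# the highest average gets rank 1.
ranked <- Df %>%
  group_by(RBD, Curso) %>%
  mutate(puesto = min_rank(desc(Promedio))) %>%
  ungroup()

print(ranked)
```

group_by() on its own only marks the groups, which is why the data frame looked unchanged; the mutate() (or summarise()) that follows is what uses them. With enough students per group, dplyr's ntile() could be swapped in to bin the ranks into percentiles.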

Regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com

On Thu, 4 Nov 2021 at 18:01, Maximiliano Asencio
wrote:

> Hi, how is everyone.
> I have a database with the average marks of different students at different
> schools, for different grade levels, and I need to group the students by
> school and grade level, and then compute their marks as percentiles (with
> the highest average being percentile 1), but I have no idea how to do it!
> I tried the group_by function:
>
> Df = Df %>% group_by(RBD, Curso)
>
> (RBD is the school code)
> But it did nothing to the data frame. I can't do it with ifelse() either,
> since there are thousands of schools. The idea is for the data to end up
> grouped, each student with their marks by school and grade level. Does
> anyone have an idea of how to do it?
>
> Best,
> Max
>
> [[alternative HTML version deleted]]
>
> ___
> R-help-es mailing list
> R-help-es@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-help-es
>

[[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


[R-es] Sorting a data frame by variables

2021-11-04 Thread Maximiliano Asencio
Hi, how is everyone.
I have a database with the average marks of different students at different
schools, for different grade levels, and I need to group the students by
school and grade level, and then compute their marks as percentiles (with
the highest average being percentile 1), but I have no idea how to do it!
I tried the group_by function:

Df = Df %>% group_by(RBD, Curso)

(RBD is the school code)
But it did nothing to the data frame. I can't do it with ifelse() either,
since there are thousands of schools. The idea is for the data to end up
grouped, each student with their marks by school and grade level. Does
anyone have an idea of how to do it?

Best,
Max



Re: [R] names.data.frame?

2021-11-04 Thread Bert Gunter
" Running `methods(names)` lists quite a few methods, ..."

Depending on what packages you have loaded of course.

Bert

On Thu, Nov 4, 2021 at 7:43 AM Duncan Murdoch 
wrote:

> On 04/11/2021 10:38 a.m., Jorgen Harmse via R-help wrote:
> > Can someone please explain what Leonard Mada is trying to do? As far as
> I know, names is not generic and there is no names.data.frame because it’s
> not needed. (A data.frame seems to be just a named list with some extra
> functionality that depends on every element being a vector with the same
> length and some overloading of list functions to ensure that that is always
> true.) The other answers confused me more.
>
> According to the help page, names() is a generic function.  Running
> `methods(names)` lists quite a few methods, but you're right, there's no
> names.data.frame because it's not needed, the default method is fine.
>
> Duncan Murdoch
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] What to do when problems() returns nothing [RESOLVED]

2021-11-04 Thread Rich Shepard

On Thu, 4 Nov 2021, Micha Silver wrote:


Why are you importing the last "ft" column as an integer when it's clearly
decimal data?


Micha,

Probably because I was still thinking of the discharge data which are
integers. That explains all the issues.

Mea culpa!

Many thanks for seeing what I kept missing,

Rich



Re: [R] What to do when problems() returns nothing

2021-11-04 Thread Micha Silver



On 04/11/2021 16:02, Rich Shepard wrote:

On Thu, 4 Nov 2021, Ben Tupper wrote:

  )

The gzipped data file can be downloaded from 
.



This seems to work fine for me:


library(readr)
cor_stage_file <- "cor-stage.csv"
cor_stage <- read_csv(cor_stage_file,
                      col_types = list(site_nbr = col_integer(),
                                       year = col_character(),
                                       mon = col_character(),
                                       day = col_character(),
                                       hr = col_character(),
                                       min = col_character(),
                                       tz = col_character(),
                                       ft = col_double()))

# Add a proper datetime column
date_string <- with(cor_stage, paste(year, mon, day, sep = "-"))
time_string <- with(cor_stage, paste(hr, min, "00", sep = ":"))
cor_stage['date_time'] <- as.POSIXct(paste(date_string, time_string,
                                           cor_stage$tz))


Why are you importing the last "ft" column as an integer when it's 
clearly decimal data?



--
Micha Silver
Ben Gurion Univ.
Sde Boker, Remote Sensing Lab
cell: +972-523-665918



Re: [R] names.data.frame?

2021-11-04 Thread Duncan Murdoch

On 04/11/2021 10:38 a.m., Jorgen Harmse via R-help wrote:

Can someone please explain what Leonard Mada is trying to do? As far as I know, 
names is not generic and there is no names.data.frame because it’s not needed. 
(A data.frame seems to be just a named list with some extra functionality that 
depends on every element being a vector with the same length and some 
overloading of list functions to ensure that that is always true.) The other 
answers confused me more.


According to the help page, names() is a generic function.  Running 
`methods(names)` lists quite a few methods, but you're right, there's no 
names.data.frame because it's not needed, the default method is fine.
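
A quick way to check this in a fresh session, sketched with a throwaway data frame (the object names are only illustrative):

```r
df <- data.frame(a = 1:3, b = letters[1:3])

# names() is a primitive function that dispatches internally.
names_is_primitive <- is.primitive(names)
print(names_is_primitive)

# Look for an S3 method for data frames; NULL means none is registered.
df_method <- getS3method("names", "data.frame", optional = TRUE)
print(is.null(df_method))

# The default behaviour already returns the column names.
print(names(df))
```

As noted, what methods(names) lists depends on which packages are loaded, so the output will vary between sessions.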


Duncan Murdoch



Re: [R] names.data.frame?

2021-11-04 Thread Jorgen Harmse via R-help
Can someone please explain what Leonard Mada is trying to do? As far as I know, 
names is not generic and there is no names.data.frame because it’s not needed. 
(A data.frame seems to be just a named list with some extra functionality that 
depends on every element being a vector with the same length and some 
overloading of list functions to ensure that that is always true.) The other 
answers confused me more.

Regards,
Jorgen Harmse.




Re: [R] Live Online Training for High School Teachers and Students

2021-11-04 Thread Thierry Onkelinx via R-help
Dear Tracy,

Maybe a workshop of Data Carpentry (https://datacarpentry.org/) might be
relevant.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkel...@inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
///




On Thu, 4 Nov 2021 at 08:07, Tracy Lenz wrote:

> Hi,
>
> I am looking for live training that can be conducted via Zoom or another
> online platform to assist high school teachers and students who are working
> with R. These teachers and students are using R at a very basic level.
> They've reviewed a variety of beginner-level texts and videos on R, but
> they continue to encounter issues that could be resolved in a session with
> someone who is more familiar with R. I'm not looking for a long-term
> solution such as a Code Academy course; rather, this session would be
> intended as a brief beginner's introduction to R as well as a Q&A for
> specific use cases and troubleshooting. I've searched online for such
> offerings but have not found anything. If anyone has any advice, I'd
> appreciate it. Thanks!
>
> Tracy Lenz
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



Re: [R] What to do when problems() returns nothing

2021-11-04 Thread Rich Shepard

On Thu, 4 Nov 2021, Ben Tupper wrote:


The help for problems() shows that the expected argument has a default
value of .Last.value. If you don't provide the input argument, it just
uses the last thing your R session evaluated. That's great if you run
problems() right after your issue arises. But you have inserted
stop_for_problems() before you get a chance to run problems(). So, in your
case, if you want to inspect the problems associated with x, you should
provide x explicitly, à la problems(x).


Ben,

I've isolated one data file to import by removing both stop_for_problems()
and problems():
library(tidyverse)

cor_stage <- read_csv("../data/cor-stage.csv", col_names = TRUE,
  col_types = list (
  site_nbr = col_character(),
  year = col_integer(),
  mon = col_integer(),
  day = col_integer(),
  hr = col_double(),
  min = col_double(),
  ft = col_integer())
  )

The gzipped data file can be downloaded from .

R imports the file but when I look at it the last column has not been
imported and problems() doesn't return them:


source('import2.r')
cor_stage

# A tibble: 415,903 × 8
   site_nbr  year   mon   day    hr   min tz       ft
 1 14171600  2009    10    23     0     0 PDT      NA
 2 14171600  2009    10    23     0    15 PDT      NA
 3 14171600  2009    10    23     0    30 PDT      NA
 4 14171600  2009    10    23     0    45 PDT      NA
 5 14171600  2009    10    23     1     0 PDT      NA
 6 14171600  2009    10    23     1    15 PDT      NA
 7 14171600  2009    10    23     1    30 PDT      NA
 8 14171600  2009    10    23     1    45 PDT      NA
 9 14171600  2009    10    23     2     0 PDT      NA
10 14171600  2009    10    23     2    15 PDT      NA
# … with 415,893 more rows
Warning message:
One or more parsing issues, see `problems()` for details 

problems()



There are 20 .csv files; a few import properly, the rest don't. I've not
before needed to import this many data files for a project, but read.csv()
has never failed to import a data column, either.

Thanks,

Rich



Re: [R] What to do when problems() returns nothing

2021-11-04 Thread Ben Tupper
Hi,

The help for problems() shows that the expected argument has a default
value of .Last.value.  If you don't provide the input argument, it just
uses the last thing your R session evaluated.  That's great if you run
problems() right after your issue arises.  But you have inserted
stop_for_problems() before you get a chance to run problems().  So, in your
case, if you want to inspect the problems associated with x, you
should provide x explicitly, à la problems(x).

Cheers,
Ben
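
A minimal sketch of the advice above, assuming readr 2.x is installed. The
in-memory CSV and the column mismatch (a decimal value read as col_integer())
are hypothetical, chosen only to provoke a parsing issue; they are not taken
from the poster's file.

```r
library(readr)

# Hypothetical data: "ft" contains a decimal value but is declared integer below
csv_text <- I("site_nbr,ft\n14171600,2.5\n14171600,3\n")

x <- read_csv(csv_text,
              col_types = cols(site_nbr = col_character(),
                               ft = col_integer()))

# Pass the object explicitly, so the result does not depend on whatever
# .Last.value happens to hold when problems() is called
p <- problems(x)
print(p)
```

Keeping the imported object in a variable and handing it to problems() should
work no matter what was evaluated in between.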


On Wed, Nov 3, 2021 at 3:30 PM Rich Shepard 
wrote:

> On Wed, 3 Nov 2021, Bert Gunter wrote:
>
> > More to the point, the tidyverse galaxy tries to largely replace R's
> > standard functionality and has its own help forum. So I think you should
> > post there, rather than here, for questions about it:
> > https://www.tidyverse.org/help/
>
> Bert,
>
> Thank you very much. I am trying to learn tidyverse and had no idea it had
> its own help.
>
> I will post tidyverse questions there.
>
> Regards,
>
> Rich


-- 
Ben Tupper (he/him)
Bigelow Laboratory for Ocean Science
East Boothbay, Maine
http://www.bigelow.org/
https://eco.bigelow.org



Re: [R] [External Email] Live Online Training for High School Teachers and Students

2021-11-04 Thread Christopher W Ryan via R-help
Tracy--

I enjoy doing this sort of thing. Over the years I've done two full-day
"introduction to R" workshops for high school students. The workshops also
inevitably get into software-agnostic, basic issues about how to think
about data, and how to measure, record, and store it---which is all pretty
cool. They were in-person, pre-pandemic workshops, but I believe could be
adapted to a remote, online approach.  Feel free to email me.

If you don't already know about it, you might also be interested in the
R-sig-teaching List here:
https://stat.ethz.ch/mailman/listinfo/r-sig-teaching

--Chris Ryan


On Thu, Nov 4, 2021 at 3:06 AM Tracy Lenz  wrote:

> Hi,
>
> I am looking for live training that can be conducted via Zoom or another
> online platform to assist high school teachers and students who are working
> with R. These teachers and students are using R at a very basic level.
> They've reviewed a variety of beginner-level texts and videos on R, but
> they continue to encounter issues that could be resolved in a session with
> someone who is more familiar with R. I'm not looking for a long-term
> solution such as a Code Academy course; rather, this session would be
> intended as a brief beginner's introduction to R as well as a Q&A for
> specific use cases and troubleshooting. I've searched online for such
> offerings but have not found anything. If anyone has any advice, I'd
> appreciate it. Thanks!
>
> Tracy Lenz


Re: [R] names.data.frame?

2021-11-04 Thread Duncan Murdoch

On 03/11/2021 3:42 p.m., Duncan Murdoch wrote:
> On 03/11/2021 2:54 p.m., Andrew Simmons wrote:
>>   ... deletions ...
>>
>> As a side note, I would suggest making your class through the methods
>> package, with methods::setClass("pm", ...)
>> See the documentation for setClass for more details, it's the recommended
>> way to define classes in R.
>
> That's incorrect.  It is *a* recommended way to define classes in R, but
> there are other recommended ways as well, for doing other kinds of
> things, and many people stick with the S3 system without formal classes
> at all.

In fact, Andrew was paraphrasing some documentation, so his statement
isn't incorrect.

Duncan Murdoch

> If you're writing a Bioconductor package you should probably use the
> formal methods.  If you're writing code for other purposes, you should
> think about whether you need formal classes at all, and if so, whether
> the methods package formalism is a match for what you're doing.
>
> Duncan Murdoch
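
For readers comparing the two approaches discussed above, here is a rough
sketch of the same toy class written both ways. Only the class name "pm" comes
from the thread; the "value" field/slot, the helper new_pm() and the name
"pmS4" are hypothetical and exist only for illustration.

```r
library(methods)

# S3: no formal definition; the class is just an attribute on an ordinary list
new_pm <- function(x) structure(list(value = x), class = "pm")
print.pm <- function(x, ...) {
  cat("<pm> value:", x$value, "\n")
  invisible(x)
}

# S4: a formal class with a declared, type-checked slot
setClass("pmS4", representation(value = "numeric"))
setMethod("show", "pmS4", function(object) {
  cat("<pmS4> value:", object@value, "\n")
})

a <- new_pm(1.5)               # S3 object
b <- new("pmS4", value = 1.5)  # S4 object
print(a)
b
```

The S3 version is shorter and unchecked; the S4 version rejects, for example,
new("pmS4", value = "a") because the slot is declared numeric. Which trade-off
fits depends on the use case, as noted above.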





[R] severe bug in LICORS/kmeanspp

2021-11-04 Thread fritzke
Hello,

I found a severe bug in the function kmeanspp (k-means++ implementation) in the 
LICORS package 
(https://www.rdocumentation.org/packages/LICORS/versions/0.2.0/topics/kmeanspp) 
and need some advice on how to handle this (therefore posting here).
 
Since LICORS has not been updated since 2013, I am not sure if there is an
active maintainer. The named maintainer has worked for Google since 2014 and may
have abandoned the package.
 
On the other hand, this is one of the few implementations of k-means++ in R, and
it comes up first when searching.
 
The bug leads to inferior results.  In its current form, the results were much 
worse than those of Hartigan-Wong (the default for k-means in the stats 
package) for all test problems I tried. However, after fixing the bug, kmeanspp 
found better results than Hartigan-Wong for all those problems. So anyone 
comparing those two algorithms based on the current implementations in R may 
have come to the wrong conclusions. BTW: The Hartigan-Wong implementation
(Fortran) is *much* faster than LICORS/kmeanspp, which is written in pure R
(before making one call to stats/kmeans at the end), but that is not the point
here.
 
The bug concerns a distance computation that should yield a matrix of squared
distances between all data vectors and all current codebook vectors, but does
not. The code and an example illustrating the problem are shown below.
Basically, to subtract a vector from every row of a matrix, one has to expand
the vector into a matrix whose rows are all copies of that vector; otherwise R
recycles the vector down the columns. The fix is trivial.
 
I stumbled upon this because kmeanspp produced counterintuitive and very poor 
results.
 
Is this of any interest? Should kmeanspp in LICORS be fixed? I have neither the 
experience nor the time to take over an R-package. I could write a more formal 
bug report, if required, but I initially would like to know if this is 
considered relevant.

Suggestions/comments are welcome.
 
Kind Regards
Bernd Fritzke
 
The code 
            if (ndim == 1) {
                dists <- apply(cbind(data[center_ids, ]), 1, 
                  function(center) {
                    rowSums((data - center)^2)
                  })
            }
            else {
                dists <- apply(data[center_ids, ], 1, function(center) {
                  rowSums((data - center)^2)
                })
            }
 
should rather be (the two changed lines are marked as "fixed"):
 
            if (ndim == 1) {
                dists <- apply(cbind(data[center_ids, ]), 1, 
                  function(center) {
                    rowSums((data - rep(center,each=nrow(data) ) )^2) # fixed
                  })
            }
            else {
                dists <- apply(data[center_ids, ], 1, function(center) {
                  rowSums((data - rep(center,each=nrow(data) ) )^2) # fixed
                })
            }
 
Here is some example code illustrating the problem. The code should compute the
squared distances between the six two-dimensional vectors in "data" and three
codebook vectors, which happen to be rows 1, 2, and 4 of "data".
RSTUDIO

```{r}
data=cbind(1:6,rep(0,6))
print("data")
print(data)
center_ids=c(1,2,4)
print("codebook")
print(data[center_ids, ])

# fixed
dists <- apply(data[center_ids, ], 1, 
               function(center) {
                  rowSums((data - rep(center,each=nrow(data)))^2)
              }
)
print("dists (correct)")
print(dists)

# buggy
dists <- apply(data[center_ids, ], 1, 
               function(center) {
                  rowSums((data - center)^2)
              }
)
print("buggy dists from LICORS")
dists
```
Output:

[1] "data"
     [,1] [,2]
[1,]    1    0
[2,]    2    0
[3,]    3    0
[4,]    4    0
[5,]    5    0
[6,]    6    0
[1] "codebook"
     [,1] [,2]
[1,]    1    0
[2,]    2    0
[3,]    4    0
[1] "dists (correct)"
     [,1] [,2] [,3]
[1,]    0    1    9
[2,]    1    0    4
[3,]    4    1    1
[4,]    9    4    0
[5,]   16    9    1
[6,]   25   16    4
[1] "buggy dists from LICORS"
     [,1] [,2] [,3]
[1,]    1    5   25
[2,]    4    4    4
[3,]    5    5   17
[4,]   16   16   16
[5,]   17   13   17
[6,]   36   36   36
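
The second column of both outputs above can be reproduced for the single
center (2, 0) with base R alone. The sketch below also shows two alternative
ways of writing the row-wise subtraction, sweep() and a double transpose; these
are not part of the proposed patch, just equivalent formulations.

```r
data <- cbind(1:6, rep(0, 6))
center <- c(2, 0)  # row 2 of data, i.e. the second codebook vector

# Buggy: `center` is recycled down the columns (column-major order), so
# column 1 has 2, 0, 2, 0, ... subtracted instead of 2 in every row
buggy <- rowSums((data - center)^2)

# Three equivalent ways to subtract `center` from every row of `data`
fix_rep   <- rowSums((data - rep(center, each = nrow(data)))^2)
fix_sweep <- rowSums(sweep(data, 2, center)^2)
fix_t     <- colSums((t(data) - center)^2)

print(rbind(buggy, fix_rep, fix_sweep, fix_t))
```

The transpose variant works because recycling a length-2 vector over a 2 x 6
matrix lines up with its columns, each of which holds one data vector.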
 
--
Dr.-Ing. Bernd Fritzke



[R] Live Online Training for High School Teachers and Students

2021-11-04 Thread Tracy Lenz
Hi,

I am looking for live training that can be conducted via Zoom or another online 
platform to assist high school teachers and students who are working with R. 
These teachers and students are using R at a very basic level. They've reviewed 
a variety of beginner-level texts and videos on R, but they continue to 
encounter issues that could be resolved in a session with someone who is more 
familiar with R. I'm not looking for a long-term solution such as a Code 
Academy course; rather, this session would be intended as a brief beginner's 
introduction to R as well as a Q&A for specific use cases and troubleshooting. 
I've searched online for such offerings but have not found anything. If anyone 
has any advice, I'd appreciate it. Thanks!

Tracy Lenz