Dear PIKAL, dear all,

thanks again a lot. I have finally understood what "in line" means.
I will definitely read "R-intro", and at the moment I am reading an R tutorial.
I will not post HTML-formatted messages again.

May I ask for some final suggestions:
- how to get the daily mean;
- how to deal with NA.

Indeed, after changing the date format I get

   Group.1      str1      str2      str3
1        1 -82.43636 -46.12437 -319.2710
2        2 -82.06105 -45.74184 -319.2696
3        3 -82.05527 -45.52650 -319.2416
4        4 -82.03535 -47.59191 -319.2275
...
31      31 -86.10234 -47.06247 -340.0968

As I said previously, this is not correct, because I had not made my purpose clear.
As I told you some days ago, I have a *.csv file with hourly data from 10/21/1998 to 12/31/2016.
I would like to compute the daily means: one mean of the hourly data for each day from 10/21/1998 to 12/31/2016, and not 31 values.

Thank you very much, especially for your patience.
I am learning a lot.

Thanks again,
Diego

On 2 August 2018 at 09:32, PIKAL Petr <petr.pi...@precheza.cz> wrote:

> Hi
>
> see in line (and please do not post HTML formatted messages, it could be
> scrambled)
>
> *From:* Diego Avesani <diego.aves...@gmail.com>
> *Sent:* Thursday, August 2, 2018 8:56 AM
> *To:* jim holtman <jholt...@gmail.com>; PIKAL Petr <petr.pi...@precheza.cz>
> *Cc:* R mailing list <r-help@r-project.org>
> *Subject:* Re: [R] read txt file - date - no space
>
> Dear
>
> I have checked one of the lines that gives me problems, I mean, one that
> gives NA after R processing. I think it is similar to the others:
>
> You should stop **thinking** and instead do real inspection of "offending"
> values.
>
> 10/12/1998 10:00,0,0,0
> 10/12/1998 11:00,0,0,0
> 10/12/1998 12:00,0,0,0
> 10/12/1998 13:00,0,0,0
> 10/12/1998 14:00,0,0,0
> 10/12/1998 15:00,0,0,0
> 10/12/1998 16:00,0,0,0
> 10/12/1998 17:00,0,0,0
>
> These lines do not pose any problem with formatting.
>
> > test<-read.table("clipboard", sep=",")
> > str(test)
> 'data.frame': 8 obs. of 4 variables:
>  $ V1: Factor w/ 8 levels "10/12/1998 10:00",..: 1 2 3 4 5 6 7 8
>  $ V2: int 0 0 0 0 0 0 0 0
>  $ V3: int 0 0 0 0 0 0 0 0
>  $ V4: int 0 0 0 0 0 0 0 0
> > as.POSIXct(test$V1, format="%d/%m/%Y %H:%M")
> [1] "1998-12-10 10:00:00 CET" "1998-12-10 11:00:00 CET"
> [3] "1998-12-10 12:00:00 CET" "1998-12-10 13:00:00 CET"
> [5] "1998-12-10 14:00:00 CET" "1998-12-10 15:00:00 CET"
> [7] "1998-12-10 16:00:00 CET" "1998-12-10 17:00:00 CET"
>
> @jim: It seems that your suggestion is focused on reading data from the
> terminal. Is it possible to apply it to a *.csv file?
>
> @Pikal: Could it be that there is some date conversion error?
>
> Well, your str(MyData) result suggests that conversion from character to
> POSIX was done correctly (at least partly).
>
> However, the NAs in the date column you posted in your second mail suggest
> that some values in the input are probably formatted differently and they
> are changed to NA during POSIX conversion.
>
> You could check which values are problematic if, instead of directly
> changing the date column to POSIX, you add a new column with the converted
> POSIX values to your data.
>
> So read your data from the csv file and change date to POSIX, but store it
> in a different column of the data frame.
>
> MyData$date2 <- as.POSIXct(MyData$date, format="%d/%m/%Y %H:%M")
>
> and check which values in your original file are formatted differently.
>
> something like
>
> MyData$date[is.na(MyData$date2)]
>
> However, your (very basic) questions suggest that you have only a minor
> understanding of what R objects are and how to check, inspect and
> manipulate them. You could do yourself a big favour by going through the
> basic documentation, as I suggested before.
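A minimal sketch of the check described above, on a small made-up data frame (the rows are illustrative, not taken from the real file). It assumes the timestamps are written month/day/year, which the range 10/21/1998 to 12/31/2016 quoted at the top of this thread suggests. Under that assumption `%d/%m/%Y` would turn every row whose day is above 12 into NA, while `%m/%d/%Y` parses them. The column name `date3` and the variables `bad_dmy` and `bad_mdy` are hypothetical names used only here.

```r
# Hypothetical sample: month/day/year strings like those in the thread
MyData <- data.frame(
  date = c("10/1/1998 0:00", "10/12/1998 10:00",
           "10/21/1998 0:00", "12/31/2016 23:00"),
  str1 = c(0.6, 0, 0.2, 0.4),
  stringsAsFactors = FALSE
)

# Keep the original text and store each conversion in a new column.
# tz = "UTC" is an assumption made to avoid daylight-saving gaps.
MyData$date2 <- as.POSIXct(MyData$date, format = "%d/%m/%Y %H:%M", tz = "UTC")
MyData$date3 <- as.POSIXct(MyData$date, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Original strings that failed to convert under each format
bad_dmy <- MyData$date[is.na(MyData$date2)]
bad_mdy <- MyData$date[is.na(MyData$date3)]
print(bad_dmy)
print(bad_mdy)
```

If the real file behaves like this sample, the strings listed in `bad_dmy` all have a second field larger than 12, which points at the format string rather than at damaged input.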
>
> Cheers
> Petr
>
> Thanks again,
> Diego
>
> On 1 August 2018 at 17:01, jim holtman <jholt...@gmail.com> wrote:
>
> Try this:
>
> > library(lubridate)
> > library(tidyverse)
> > input <- read.csv(text = "date,str1,str2,str3
> + 10/1/1998 0:00,0.6,0,0
> + 10/1/1998 1:00,0.2,0.2,0.2
> + 10/1/1998 2:00,0.6,0.2,0.4
> + 10/1/1998 3:00,0,0,0.6
> + 10/1/1998 4:00,0,0,0
> + 10/1/1998 5:00,0,0,0
> + 10/1/1998 6:00,0,0,0
> + 10/1/1998 7:00,0.2,0,0", as.is = TRUE)
> > # convert the date and add the "day" so summarize
> > input <- input %>%
> +   mutate(date = mdy_hm(date),
> +          day = floor_date(date, unit = 'day')
> +   )
> >
> > by_day <- input %>%
> +   group_by(day) %>%
> +   summarise(m_s1 = mean(str1),
> +             m_s2 = mean(str2),
> +             m_s3 = mean(str3)
> +   )
> >
> > by_day
> # A tibble: 1 x 4
>   day                  m_s1   m_s2  m_s3
>   <dttm>              <dbl>  <dbl> <dbl>
> 1 1998-10-01 00:00:00 0.200 0.0500 0.150
>
> Jim Holtman
> *Data Munger Guru*
>
> *What is the problem that you are trying to solve? Tell me what you want
> to do, not how you want to do it.*
>
> On Tue, Jul 31, 2018 at 11:54 PM Diego Avesani <diego.aves...@gmail.com>
> wrote:
>
> Dear all,
> I am sorry, I caused a lot of confusion. I have to relax and start
> all over again in order to understand.
> If I could, I would like to start again, without mixing strategies, and
> wait for your advice.
>
> I really appreciate your help, really really.
> Here is my new file, a *.csv file (by the way, is it possible to attach it
> on the mailing list?)
>
> date,str1,str2,str3
> 10/1/1998 0:00,0.6,0,0
> 10/1/1998 1:00,0.2,0.2,0.2
> 10/1/1998 2:00,0.6,0.2,0.4
> 10/1/1998 3:00,0,0,0.6
> 10/1/1998 4:00,0,0,0
> 10/1/1998 5:00,0,0,0
> 10/1/1998 6:00,0,0,0
> 10/1/1998 7:00,0.2,0,0
>
> I read it as:
> MyData <- read.csv(file="obs_prec.csv",header=TRUE, sep=",")
>
> At this point I would like to have the daily mean.
> What would you suggest?
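One possible base-R way to get one mean per calendar day from data shaped like the file above, sketched under some assumptions: the sample rows (including the NA) are made up, the timestamps are taken to be month/day/year, and `tz = "UTC"` is chosen only to keep the example reproducible. The idea is to derive a `Date` from the parsed timestamp and group on that, passing `na.rm = TRUE` so missing hours are skipped.

```r
# Hypothetical hourly data spanning two days, with one missing value
MyData <- read.csv(text = "date,str1,str2,str3
10/1/1998 0:00,0.6,0,0
10/1/1998 1:00,0.2,0.2,0.2
10/1/1998 2:00,NA,0.2,0.4
10/2/1998 0:00,0.4,0,0.6
10/2/1998 1:00,0.0,0.2,0", stringsAsFactors = FALSE)

# Parse the timestamp (month/day/year assumed) and derive the calendar date
MyData$datetime <- as.POSIXct(MyData$date, format = "%m/%d/%Y %H:%M", tz = "UTC")
MyData$day <- as.Date(MyData$datetime, tz = "UTC")

# One mean per calendar day and column; na.rm = TRUE skips missing hours
daily <- aggregate(MyData[c("str1", "str2", "str3")],
                   by = list(day = MyData$day),
                   FUN = mean, na.rm = TRUE)
print(daily)
```

Grouping on the full date matters: grouping on the day of the month alone (for example on `format(datetime, "%d")`) pools the same day number across all months and years and can never give more than 31 groups, which may be what produced a 31-row result.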
>
> Really Really thanks,
> You are my lifesaver
>
> Thanks
>
> Diego
>
> On 1 August 2018 at 01:01, Jeff Newmiller <jdnew...@dcn.davis.ca.us>
> wrote:
>
> > ... and the most common source of NA values in time data is wrong
> > timezones. You really need to make sure the timezone that is assumed when
> > the character data are converted to POSIXt agrees with the data. In most
> > cases the easiest way to ensure this is to use
> >
> > Sys.setenv(TZ="US/Pacific")
> >
> > or whatever timezone from
> >
> > OlsonNames()
> >
> > corresponds with your data. Execute this setenv function before the
> > strptime or as.POSIXct() function call.
> >
> > You can use
> >
> > MyData[ is.na(MyData$datetime), ]
> >
> > to see which records are failing to convert time.
> >
> > [1] https://github.com/jdnewmil/eci298sp2016/blob/master/QuickHowto1
> >
> > On July 31, 2018 3:04:05 PM PDT, Jim Lemon <drjimle...@gmail.com> wrote:
> > >Hi Diego,
> > >I think the error is due to NA values in your data file. If I extend
> > >your example and run it, I get no errors:
> > >
> > >MyData<-read.table(text="103001930 103001580 103001530
> > >1998-10-01 00:00:00 0.6 0 0
> > >1998-10-01 01:00:00 0.2 0.2 0.2
> > >1998-10-01 02:00:00 0.6 0.2 0.4
> > >1998-10-01 03:00:00 0 0 0.6
> > >1998-10-01 04:00:00 0 0 0
> > >1998-10-01 05:00:00 0 0 0
> > >1998-10-01 06:00:00 0 0 0
> > >1998-10-01 07:00:00 0.2 0 0
> > >1998-10-01 08:00:00 0.6 0 0
> > >1998-10-01 09:00:00 0.2 0.2 0.2
> > >1998-10-01 10:00:00 0.6 0.2 0.4
> > >1998-10-01 11:00:00 0 0 0.6
> > >1998-10-01 12:00:00 0 0 0
> > >1998-10-01 13:00:00 0 0 0
> > >1998-10-01 14:00:00 0 0 0
> > >1998-10-01 15:00:00 0.2 0 0
> > >1998-10-01 16:00:00 0.6 0 0
> > >1998-10-01 17:00:00 0.2 0.2 0.2
> > >1998-10-01 18:00:00 0.6 0.2 0.4
> > >1998-10-01 19:00:00 0 0 0.6
> > >1998-10-01 20:00:00 0 0 0
> > >1998-10-01 21:00:00 0 0 0
> > >1998-10-01 22:00:00 0 0 0
> > >1998-10-01 23:00:00 0.2 0 0
> > >1998-10-02 00:00:00 0.6 0 0
> > >1998-10-02 01:00:00 0.2 0.2 0.2
> > >1998-10-02 02:00:00 0.6 0.2 0.4
> > >1998-10-02 03:00:00 0 0 0.6
> > >1998-10-02 04:00:00 0 0 0
> > >1998-10-02 05:00:00 0 0 0
> > >1998-10-02 06:00:00 0 0 0
> > >1998-10-02 07:00:00 0.2 0 0
> > >1998-10-02 08:00:00 0.6 0 0
> > >1998-10-02 09:00:00 0.2 0.2 0.2
> > >1998-10-02 10:00:00 0.6 0.2 0.4
> > >1998-10-02 11:00:00 0 0 0.6
> > >1998-10-02 12:00:00 0 0 0
> > >1998-10-02 13:00:00 0 0 0
> > >1998-10-02 14:00:00 0 0 0
> > >1998-10-02 15:00:00 0.2 0 0
> > >1998-10-02 16:00:00 0.6 0 0
> > >1998-10-02 17:00:00 0.2 0.2 0.2
> > >1998-10-02 18:00:00 0.6 0.2 0.4
> > >1998-10-02 19:00:00 0 0 0.6
> > >1998-10-02 20:00:00 0 0 0
> > >1998-10-02 21:00:00 0 0 0
> > >1998-10-02 22:00:00 0 0 0
> > >1998-10-02 23:00:00 0.2 0 0",
> > >skip=1,stringsAsFactors=FALSE)
> > >names(MyData)<-c("date","time","st1","st2","st3")
> > >MyData$datetime<-strptime(paste(MyData$date,MyData$time),
> > > format="%Y-%m-%d %H:%M:%S")
> > >MyData$datetime
> > >st1_daily<-by(MyData$st1,MyData$date,mean)
> > >st2_daily<-by(MyData$st2,MyData$date,mean)
> > >st3_daily<-by(MyData$st3,MyData$date,mean)
> > >st1_daily
> > >st2_daily
> > >st3_daily
> > >
> > >Try adding na.rm=TRUE to the "by" calls:
> > >
> > >st1_daily<-by(MyData$st1,MyData$date,mean,na.rm=TRUE)
> > >st2_daily<-by(MyData$st2,MyData$date,mean,na.rm=TRUE)
> > >st3_daily<-by(MyData$st3,MyData$date,mean,na.rm=TRUE)
> > >
> > >Jim
> > >
> > >On Tue, Jul 31, 2018 at 11:11 PM, Diego Avesani
> > ><diego.aves...@gmail.com> wrote:
> > >> Dear all,
> > >>
> > >> I still have problems with the dates.
> > >> Could you please tell me how to use POSIXct?
> > >> Indeed, I have found this command:
> > >> timeAverage, but I am not able to convert MyDate to a proper date.
> > >>
> > >> Thanks a lot
> > >> I hope not to bother you, at least not too much
> > >>
> > >> Diego
> > >>
> > >> On 31 July 2018 at 11:12, Diego Avesani <diego.aves...@gmail.com>
> > >> wrote:
> > >>>
> > >>> Dear Jim, Dear all,
> > >>>
> > >>> thanks a lot.
> > >>>
> > >>> Unfortunately, I get the following error:
> > >>>
> > >>> st1_daily<-by(MyData$st1,MyData$date,mean)
> > >>> Error in tapply(seq_len(0L), list(`MyData$date` = c(913L, 914L, 925L, :
> > >>> arguments must have same length
> > >>>
> > >>> This is particularly strange. Indeed, if I apply
> > >>>
> > >>> mean(MyData$str1,na.rm=TRUE)
> > >>>
> > >>> it works
> > >>>
> > >>> Sorry, I have to learn a lot.
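The `seq_len(0L)` in the error above is what one would expect if `MyData$st1` has length zero. The `mean()` call that works in the same message uses `str1`, not `st1`, so a likely cause (an assumption, not confirmed anywhere in the thread) is a column-name mismatch: `$` returns `NULL` for a name that is not a column instead of stopping. A small sketch with made-up values; `tapply()` is used here in place of `by()` simply because it is the function named in the error message:

```r
# Hypothetical data frame whose value column is named str1, not st1
MyData <- data.frame(date = c("1998-10-01", "1998-10-01", "1998-10-02"),
                     str1 = c(0.6, 0.2, 0.4),
                     stringsAsFactors = FALSE)

# `$` with a name that is not a column silently returns NULL (length 0)
missing_col <- MyData$st1
print(is.null(missing_col))
print(length(missing_col))

# Check the real column names before grouping
print(names(MyData))

# With the correct column name there is one mean per date
str1_daily <- tapply(MyData$str1, MyData$date, mean, na.rm = TRUE)
print(str1_daily)
```

Running `names(MyData)` or `str(MyData)` right after reading the file is a cheap way to catch this kind of mismatch before any grouping step.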
> > >>> You are really boosting me
> > >>>
> > >>> Diego
> > >>>
> > >>> On 31 July 2018 at 11:02, Jim Lemon <drjimle...@gmail.com> wrote:
> > >>>>
> > >>>> Hi Diego,
> > >>>> One way you can get daily means is:
> > >>>>
> > >>>> st1_daily<-by(MyData$st1,MyData$date,mean)
> > >>>> st2_daily<-by(MyData$st2,MyData$date,mean)
> > >>>> st3_daily<-by(MyData$st3,MyData$date,mean)
> > >>>>
> > >>>> Jim
> > >>>>
> > >>>> On Tue, Jul 31, 2018 at 6:51 PM, Diego Avesani
> > >>>> <diego.aves...@gmail.com> wrote:
> > >>>> > Dear all,
> > >>>> > I have found the error, my fault. Sorry.
> > >>>> > There was an extra comma in the header line.
> > >>>> > Thanks again.
> > >>>> >
> > >>>> > If I can, I would like to ask you another question about the
> > >>>> > imported data.
> > >>>> > I would like to compute the daily average of the different data.
> > >>>> > Basically I have hourly data, and I would like to have the daily
> > >>>> > mean of them.
> > >>>> >
> > >>>> > Are there some special commands?
> > >>>> >
> > >>>> > Thanks a lot.
> > >>>> >
> > >>>> > Diego
> > >>>> >
> > >>>> > On 31 July 2018 at 10:40, Diego Avesani <diego.aves...@gmail.com>
> > >>>> > wrote:
> > >>>> >>
> > >>>> >> Dear all,
> > >>>> >> I moved to a csv file because originally the data were in a csv
> > >>>> >> file.
> > >>>> >> In addition, since, as you told me, read.csv is a special case of
> > >>>> >> read.table, I prefer to start learning from the simplest one.
> > >>>> >> After that, I will also try the *.txt format.
> > >>>> >>
> > >>>> >> With read.csv, something strange happened:
> > >>>> >>
> > >>>> >> This is now the file:
> > >>>> >>
> > >>>> >> date,st1,st2,st3,
> > >>>> >> 10/1/1998 0:00,0.6,0,0
> > >>>> >> 10/1/1998 1:00,0.2,0.2,0.2
> > >>>> >> 10/1/1998 2:00,0.6,0.2,0.4
> > >>>> >> 10/1/1998 3:00,0,0,0.6
> > >>>> >> 10/1/1998 4:00,0,0,0
> > >>>> >> 10/1/1998 5:00,0,0,0
> > >>>> >> 10/1/1998 6:00,0,0,0
> > >>>> >> 10/1/1998 7:00,0.2,0,0
> > >>>> >> 10/1/1998 8:00,0.6,0.2,0
> > >>>> >> 10/1/1998 9:00,0.2,0.4,0.4
> > >>>> >> 10/1/1998 10:00,0,0.4,0.2
> > >>>> >>
> > >>>> >> When I apply:
> > >>>> >> MyData <- read.csv(file="obs_prec.csv",header=TRUE, sep=",")
> > >>>> >>
> > >>>> >> these are the results:
> > >>>> >>
> > >>>> >>   10/1/1998 0:00 0.6 0.00 0.0 NA
> > >>>> >> 2 10/1/1998 1:00 0.2 0.20 0.2 NA
> > >>>> >> 3 10/1/1998 2:00 0.6 0.20 0.4 NA
> > >>>> >> 4 10/1/1998 3:00 0.0 0.00 0.6 NA
> > >>>> >> 5 10/1/1998 4:00 0.0 0.00 0.0 NA
> > >>>> >> 6 10/1/1998 5:00 0.0 0.00 0.0 NA
> > >>>> >> 7 10/1/1998 6:00 0.0 0.00 0.0 NA
> > >>>> >> 8 10/1/1998 7:00 0.2 0.00 0.0 NA
> > >>>> >>
> > >>>> >> I do not understand why.
> > >>>> >> Is something wrong with the date?
> > >>>> >>
> > >>>> >> Really really thanks,
> > >>>> >> I appreciate all your help a lot.
> > >>>> >>
> > >>>> >> Diego
> > >>>> >>
> > >>>> >> On 31 July 2018 at 01:25, MacQueen, Don <macque...@llnl.gov>
> > >>>> >> wrote:
> > >>>> >>>
> > >>>> >>> Or, without removing the first line
> > >>>> >>> dadf <- read.table("xxx.txt", stringsAsFactors=FALSE, skip=1)
> > >>>> >>>
> > >>>> >>> Another alternative,
> > >>>> >>> dadf$datetime <- as.POSIXct(paste(dadf$V1,dadf$V2))
> > >>>> >>> since the dates appear to be in the default format.
> > >>>> >>> (I generally prefer to work with datetimes in POSIXct class
> > >>>> >>> rather than POSIXlt class)
> > >>>> >>>
> > >>>> >>> -Don
> > >>>> >>>
> > >>>> >>> --
> > >>>> >>> Don MacQueen
> > >>>> >>> Lawrence Livermore National Laboratory
> > >>>> >>> 7000 East Ave., L-627
> > >>>> >>> Livermore, CA 94550
> > >>>> >>> 925-423-1062
> > >>>> >>> Lab cell 925-724-7509
> > >>>> >>>
> > >>>> >>> On 7/30/18, 4:03 PM, "R-help on behalf of Jim Lemon"
> > >>>> >>> <r-help-boun...@r-project.org on behalf of drjimle...@gmail.com>
> > >>>> >>> wrote:
> > >>>> >>>
> > >>>> >>>     Hi Diego,
> > >>>> >>>     You may have to do some conversion as you have three fields
> > >>>> >>>     in the first line using the default space separator and five
> > >>>> >>>     fields in subsequent lines. If the first line doesn't contain
> > >>>> >>>     any important data you can just delete it or replace it with
> > >>>> >>>     a meaningful header line with five fields and save the file
> > >>>> >>>     under another name.
> > >>>> >>>
> > >>>> >>>     It looks as though you have date-time as two fields. If so,
> > >>>> >>>     you can just read the first field if you only want the date:
> > >>>> >>>
> > >>>> >>>     # assume you have removed the first line
> > >>>> >>>     dadf<-read.table("xxx.txt",stringsAsFactors=FALSE)
> > >>>> >>>     dadf$date<-as.Date(dadf$V1,format="%Y-%m-%d")
> > >>>> >>>
> > >>>> >>>     If you want the date/time:
> > >>>> >>>
> > >>>> >>>     dadf$datetime<-strptime(paste(dadf$V1,dadf$V2),format="%Y-%m-%d %H:%M:%S")
> > >>>> >>>
> > >>>> >>>     Jim
> > >>>> >>>
> > >>>> >>>     On Tue, Jul 31, 2018 at 12:29 AM, Diego Avesani
> > >>>> >>>     <diego.aves...@gmail.com> wrote:
> > >>>> >>>     > Dear all,
> > >>>> >>>     >
> > >>>> >>>     > I am dealing with the reading of a *.txt file.
> > >>>> >>>     > The txt file has the following shape:
> > >>>> >>>     >
> > >>>> >>>     > 103001930 103001580 103001530
> > >>>> >>>     > 1998-10-01 00:00:00 0.6 0 0
> > >>>> >>>     > 1998-10-01 01:00:00 0.2 0.2 0.2
> > >>>> >>>     > 1998-10-01 02:00:00 0.6 0.2 0.4
> > >>>> >>>     > 1998-10-01 03:00:00 0 0 0.6
> > >>>> >>>     > 1998-10-01 04:00:00 0 0 0
> > >>>> >>>     > 1998-10-01 05:00:00 0 0 0
> > >>>> >>>     > 1998-10-01 06:00:00 0 0 0
> > >>>> >>>     > 1998-10-01 07:00:00 0.2 0 0
> > >>>> >>>     >
> > >>>> >>>     > If possible, I have a couple of questions, which will sound
> > >>>> >>>     > stupid, but they are important to me in order to understand
> > >>>> >>>     > how R deals with files or dates.
> > >>>> >>>     >
> > >>>> >>>     > 1) Do I have to convert it to a *.csv file?
> > >>>> >>>     > 2) Can I deal with spaces and not ","?
> > >>>> >>>     > 3) How can I read dates?
> > >>>> >>>     >
> > >>>> >>>     > thanks a lot to all of you,
> > >>>> >>>     > Thanks
> > >>>> >>>     >
> > >>>> >>>     > Diego
> > >>>> >>>     >
> > >>>> >>>     > [[alternative HTML version deleted]]
> > >>>> >>>     >
> > >>>> >>>     > ______________________________________________
> > >>>> >>>     > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >>>> >>>     > https://stat.ethz.ch/mailman/listinfo/r-help
> > >>>> >>>     > PLEASE do read the posting guide
> > >>>> >>>     > http://www.R-project.org/posting-guide.html
> > >>>> >>>     > and provide commented, minimal, self-contained, reproducible code.