[R] Alter character attribute

2010-10-28 Thread LCOG1

Hi everyone

I have some records that include a date attribute for the date and time, but
I need to analyze the data separately in GIS by month and year, so I need to
pull these attributes out into their own attribute fields.

So the input:
RawData2.. returns

  ID   period_end_date
1 22 9/10/2007 0:00:00
2 44  2/2/2006 0:00:00

and I need to get
  ID   period_end_date Month Year
  22 9/10/2007 0:00:00     9 2007
  44  2/2/2006 0:00:00     2 2006

The code below gets me this in list form, which I can then add back into the
initial data frame, BUT I have over 4.5 million records, and when I ran it, it
ran for more than 18 hours and only got through about 2.7 million records
before I gave up and ended the process.

So how can I make this more efficient and possibly add the new attributes
(month/year) to the data frame on the fly?

Thanks guys

#Create sample data
RawData2.. <- data.frame(ID=c(22,44),
  period_end_date=c("9/10/2007 0:00:00","2/2/2006 0:00:00"),
  stringsAsFactors=FALSE)

#Create lists to store month and year results
Data.Month_ <- list()
Data.Year_ <- list()
#Pull out year/month attribute and put in own column
for(i in 1:length(RawData2..$ID)){
 #Select record
 Data.X <- RawData2..[i,]
 #Separate date into month, day, and year
 DateSplit <- strsplit(Data.X$period_end_date,"/")
 #Select month
 Month <- unlist(DateSplit)[1]
 #Separate year from time attribute
 Year.X <- strsplit(unlist(DateSplit)[3]," ")
 Year.Y <- unlist(Year.X)[1]
 Data.Month_[[i]] <- Month
 Data.Year_[[i]] <- Year.Y

}


-- 
View this message in context: 
http://r.789695.n4.nabble.com/Alter-character-attribute-tp3018202p3018202.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Alter character attribute

2010-10-28 Thread jim holtman
try this:

> x <- read.table(textConnection("  ID   date time
+ 1 22 9/10/2007 0:00:00
+ 2 44  2/2/2006 0:00:00"), header = TRUE)
> closeAllConnections()
> x
  ID      date    time
1 22 9/10/2007 0:00:00
2 44  2/2/2006 0:00:00
> x$month <- sub("^([[:digit:]]+).*", "\\1", x$date)
> x$year <- sub(".*?([[:digit:]]+)$", "\\1", x$date)
> x
  ID      date    time month year
1 22 9/10/2007 0:00:00     9 2007
2 44  2/2/2006 0:00:00     2 2006
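
For comparison, the strsplit() idea from the original post can probably be kept and simply applied to the whole column at once instead of row by row. This is only a sketch: the vector name dates and the helper objects parts, month and year are made up for illustration, and it assumes every value looks like month/day/year followed by a space and a time.

```r
# Hypothetical stand-in for the period_end_date column
dates <- c("9/10/2007 0:00:00", "2/2/2006 0:00:00")

# Split every string on "/" or " " in a single vectorised call;
# each list element becomes c(month, day, year, time)
parts <- strsplit(dates, "[/ ]")

# Pick the 1st (month) and 3rd (year) piece of every element
month <- vapply(parts, `[`, character(1), 1)
year  <- vapply(parts, `[`, character(1), 3)

data.frame(dates, month, year)
```

The row-by-row data frame subsetting inside the loop is the likely bottleneck in the original code; here strsplit() is called once on the whole vector.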



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


Re: [R] Alter character attribute

2010-10-28 Thread jim holtman
I didn't see your test data at first, so here is the solution with your data:

> RawData2.. <- data.frame(ID=c(22,44),period_end_date=c("9/10/2007 0:00:00",
+ "2/2/2006 0:00:00"))
> RawData2..$month <- sub("^([[:digit:]]+).*", "\\1",
+     RawData2..$period_end_date)
> RawData2..$year <- sub(".*/([[:digit:]]+) .*", "\\1",
+     RawData2..$period_end_date)
> RawData2..
  ID   period_end_date month year
1 22 9/10/2007 0:00:00     9 2007
2 44  2/2/2006 0:00:00     2 2006
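
To get a feel for how this behaves on a larger table, the same two sub() calls can be tried on synthetic data. The sketch below is an assumption-laden stand-in: the data frame big, its 100,000 rows and the repeated sample dates are all invented for illustration, and the as.integer() conversion is an optional extra in case numeric month and year columns are wanted for GIS.

```r
# Synthetic stand-in for the real data (assumption: every value is a
# month/day/year string followed by a space and a time)
dates <- c("9/10/2007 0:00:00", "2/2/2006 0:00:00")
big <- data.frame(ID = seq_len(1e5),
                  period_end_date = rep(dates, length.out = 1e5),
                  stringsAsFactors = FALSE)

# sub() processes the whole column in one call, so no explicit loop is needed
timing <- system.time({
  big$month <- as.integer(sub("^([[:digit:]]+).*", "\\1", big$period_end_date))
  big$year  <- as.integer(sub(".*/([[:digit:]]+) .*", "\\1", big$period_end_date))
})

print(timing)   # elapsed time depends on the machine
head(big, 2)
```

Because the new columns are assigned directly to the data frame, there is no separate list to merge back in afterwards.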





Re: [R] Alter character attribute

2010-10-28 Thread Phil Spector

If you convert the dates to R date objects, I think things
will be easier:


rawdata2$period_end_date = as.Date(rawdata2$period_end_date,format='%m/%d/%Y')
rawdata2$mon = as.numeric(format(rawdata2$period_end_date,'%m'))
rawdata2$year = as.numeric(format(rawdata2$period_end_date,'%Y'))


(I'm assuming your dates are in month/day/year order.)

I can pretty much guarantee it will run in less than 18 hours :-)
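
If the original text column needs to stay as it is (the desired output in the question keeps it), a variation is to parse into a temporary vector rather than overwriting the column. This is a sketch under assumptions: the sample data frame is rebuilt here so the snippet runs on its own, the temporary name d is made up, and the dates are taken to be month/day/year. as.Date() only reads as much of each string as the format needs, so the trailing time is ignored.

```r
# Sample data rebuilt for a self-contained example
rawdata2 <- data.frame(ID = c(22, 44),
                       period_end_date = c("9/10/2007 0:00:00",
                                           "2/2/2006 0:00:00"),
                       stringsAsFactors = FALSE)

# Parse once into a temporary Date vector; the original column is untouched
d <- as.Date(rawdata2$period_end_date, format = "%m/%d/%Y")

# Extract month and year as integers from the parsed dates
rawdata2$Month <- as.integer(format(d, "%m"))
rawdata2$Year  <- as.integer(format(d, "%Y"))

rawdata2
```

Any value that does not match the format comes back as NA, so checking anyNA(d) afterwards is a cheap way to spot malformed records.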

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu



Re: [R] Alter character attribute

2010-10-28 Thread LCOG1

Changing the field into date format and then pulling out the month/year worked
best.  Thanks, I knew it was gonna be easy.

Cheers
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Alter-character-attribute-tp3018202p3018255.html
Sent from the R help mailing list archive at Nabble.com.
