I have many separate CSV files of daily stock prices. Over a few years
that adds up to hundreds of files, each named after the date of the
records it contains.

Each file contains the variables ticker (stock trading code), date, open
price, high price, low price, close price, and trading volume. For
example, the contents of a data file named 20150128.txt look like this:

FB,20150128,1.075,1.075,0.97,0.97,725221
AAPL,20150128,2.24,2.24,2.2,2.24,63682
AMZN,20150128,0.4,0.415,0.4,0.415,194900
NFLX,20150128,50.19,50.21,50.19,50.19,761845
GOOGL,20150128,1.62,1.645,1.59,1.63,684835
... and many more ...

In case it's relevant, the number of stocks is not necessarily the same in
every file (so there will be missing data). I need to import these files
and create 5 separate time series data frames from them, one each for
Open, High, Low, Close and Volume. In each data frame, rows are indexed by
date and columns by ticker. For example, the data frame Open may look like
this:

DATE,FB,AAPL,AMZN,NFLX,GOOGL,...
20150128,1.5,2.2,0.4,5.1,1.6,...
20150129,NA,2.3,0.5,5.2,1.7,...
...
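
To make the target shape concrete, here is a hand-built sketch of what I mean by the Open data frame, using only the illustrative values and the five tickers named above. The use of row names for the dates is just my assumption about how "indexed by date" could be represented; a dedicated time series class might be a better fit.

```r
# Hand-built illustration of the desired "Open" data frame.
# Values are the illustrative ones from the table above, not real data.
Open <- data.frame(
  FB    = c(1.5, NA),   # FB is absent from the second day's file
  AAPL  = c(2.2, 2.3),
  AMZN  = c(0.4, 0.5),
  NFLX  = c(5.1, 5.2),
  GOOGL = c(1.6, 1.7),
  row.names = c("20150128", "20150129")  # dates as row index (assumption)
)

print(Open)
```

The other four data frames (High, Low, Close, Volume) would have the same layout.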

What would be an efficient way to do that? I've used the following code to
read the files into a list of data frames, but I don't know what to do
next.

# list.files() takes a regular expression, not a shell glob
files = list.files(pattern="\\.txt$")
# the files have no header row
mydata = lapply(files, read.csv, header=FALSE)
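
To make the starting point reproducible, here is a minimal self-contained sketch of the reading step. It writes two tiny files to a temporary directory and reads them into a named list. The directory name, the column names in `cols`, and the contents of the second file are my own inventions for illustration; only the first file's rows come from the sample above.

```r
# Hypothetical setup: create two small files in the format described above.
tmp <- file.path(tempdir(), "daily_prices")
dir.create(tmp, showWarnings = FALSE)

writeLines(c("FB,20150128,1.075,1.075,0.97,0.97,725221",
             "AAPL,20150128,2.24,2.24,2.2,2.24,63682"),
           file.path(tmp, "20150128.txt"))

# Made-up row for a second day; FB is deliberately missing here.
writeLines("AAPL,20150129,2.3,2.35,2.25,2.3,70000",
           file.path(tmp, "20150129.txt"))

# Column labels are my own choice; the files themselves have no header.
cols <- c("ticker", "date", "open", "high", "low", "close", "volume")

# Read every .txt file into a list of data frames, one per trading day.
files <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)
mydata <- lapply(files, read.csv, header = FALSE, col.names = cols,
                 stringsAsFactors = FALSE)

# Name each list element after its file (i.e. its date).
names(mydata) <- sub("\\.txt$", "", basename(files))

str(mydata)
```

This is where I am stuck: `mydata` is a list with one data frame per day, and the tickers differ from day to day.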

Thanks,

Nathan

Disclaimer: In case it's relevant, this question is also posted on
Stack Overflow.


______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
