On Monday 23 February 2009, 17:05, Mark Knecht wrote:

> I'm attaching a small (100 line) data file out of TradeStation. Zipped
> it's about 2K. It should expand to about 10K. When I run the command
> to get 10 lines put together it works correctly and gives me a file
> with 91 lines and about 100K in size. (I.e. - 10x on my disk.)
>
> awk -v n=10 -f awkScript1.awk awkDataIn.csv >awkDataOut.csv
>
> No mangling of the first line - that must have been something earlier
> I guess. Sorry for the confusion on that front.
>
> One other item has come up as I start to play with this farther down
> the tool chain. I want to use this data in either R or RapidMiner to
> data mine for patterns. Both of those tools are easier to use if the
> first line in the file has column titles. I had originally asked
> TradeStation not to output the column titles but if I do then for the
> first line of our new file I should actually copy the first line of
> the input file N times. Something like
>
> For i=1; read line, write N times, write \n
>
> and then
>
> for i>=2 do what we're doing right now.

That is actually accomplished just by adding a bit of code:

BEGIN {FS=OFS=","}

NR==1{for(i=1;i<=n;i++){printf "%s%s", sep, $0;sep=OFS};print""} # header
NR>=2{
  r=$NF;NF--
  for(i=1;i<n;i++){
    s[i]=s[i+1]
    dt[i]=dt[i+1]
    if((NR>=n+1)&&(i==1))printf "%s%s",dt[1],OFS
    if(NR>=n+1)printf "%s%s",s[i],OFS
  }
  sep=dt[n]="";for(i=1;i<=dropcol;i++){dt[n]=dt[n] sep $i;sep=OFS}
  sub("^([^,]*,){"dropcol"}","")
  s[n]=$0
  if(NR>=n+1)printf "%s,%s\n", s[n],r
}

Note that no column is dropped from the header. If you need to do that,
just tell us how you want to do that.
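
To see what the header rule does by itself, here is a small sketch that runs only the NR==1 part on a made-up header line. The column names (Date, Time, Close) and n=3 are assumptions chosen for illustration, not taken from your TradeStation file.

```shell
# Hypothetical three-column header; real TradeStation column names will differ.
header='Date,Time,Close'

# The NR==1 header rule from the script, run on its own with n=3.
# sep starts out empty, so the first copy gets no leading comma.
out=$(printf '%s\n' "$header" | awk -v n=3 '
BEGIN { FS = OFS = "," }
NR == 1 {
    for (i = 1; i <= n; i++) { printf "%s%s", sep, $0; sep = OFS }
    print ""
}')

printf '%s\n' "$out"
```

The whole header is copied n times and joined with commas, so each copy still contains every input column, including the ones the data rule treats specially.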