On Monday 23 February 2009, 17:05, Mark Knecht wrote:

> I'm attaching a small (100 line) data file out of TradeStation. Zipped
> it's about 2K. It should expand to about 10K. When I run the command
> to get 10 lines put together it works correctly and gives me a file
> with 91 lines and about 100K in size. (I.e. - 10x on my disk.)
>
> awk -v n=10 -f awkScript1.awk awkDataIn.csv >awkDataOut.csv
>
> No mangling of the first line - that must have been something earlier
> I guess. Sorry for the confusion on that front.
>
> One other item has come up as I start to play with this farther down
> the tool chain. I want to use this data in either R or RapidMiner to
> data mine for patterns. Both of those tools are easier to use if the
> first line in the file has column titles. I had originally asked
> TradeStation not to output the column titles but if I do then for the
> first line of our new file I should actually copy the first line of
> the input file N times. Something like
>
> For i=1; read line, write N times, write \n
>
> and then
>
> for i>=2 do what we're doing right now.

That takes only a small addition to the script: a rule for the first
line (NR==1) that prints it n times.

BEGIN {FS=OFS=","}

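# header: print the first input line n times on one output line
NR==1{for(i=1;i<=n;i++){printf "%s%s", sep, $0;sep=OFS};print""}
# data lines: keep a sliding window of the last n records in s[] and dt[]
NR>=2{
  r=$NF;NF--   # save the last field in r, then drop it from the record
  for(i=1;i<n;i++){
    s[i]=s[i+1]     # shift the window by one record
    dt[i]=dt[i+1]
    if((NR>=n+1)&&(i==1))printf "%s%s",dt[1],OFS   # leading fields of the oldest record
    if(NR>=n+1)printf "%s%s",s[i],OFS
  }
  # dt[n] = the first dropcol fields of the current record
  sep=dt[n]="";for(i=1;i<=dropcol;i++){dt[n]=dt[n] sep $i;sep=OFS}
  sub("^([^,]*,){"dropcol"}","")   # strip those fields from $0
  s[n]=$0
  if(NR>=n+1)printf "%s,%s\n", s[n],r   # window is full once n data lines are read
}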

Note that no column is dropped from the header: it is simply the first
line repeated n times. If you need columns dropped there as well, just
tell us how you want it done.
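
To see what the header rule does on its own, here is a minimal sketch
that runs just the NR==1 rule from the script above. The column names
and the data row are made up for illustration, and n=3 is used only to
keep the output short:

```shell
# Hypothetical input: a 3-column header followed by one data row.
# Only the header rule is included, so the data row is not printed.
out=$(printf 'Date,Open,Close\n1/2/2009,10,11\n' |
  awk -v n=3 'BEGIN {FS=OFS=","}
    NR==1{for(i=1;i<=n;i++){printf "%s%s", sep, $0;sep=OFS};print""}')
echo "$out"
```

With this input the single output line should hold the three column
names three times over, comma separated, which is the layout R or
RapidMiner would then read as column titles.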
