Zhang Weiwu wrote: > Hi. > > I got a datasheet from my colleague in MS Excel format and I intend to > process that file with my awk/sed knowledge. The problem is: he sent me > two Excel files each with 2134 records, in fact there should be only one > excel file with 2134 rows and 295 columns, but MS Excel can only handle > 256 data columns, so he split the datasheet vertically so he can manage > to send to me. > > Now I saved both file to tab-separated-value format, how do I join them? > > I could have used join(1) but that require a join field, an ID of some > sort. I think of this: > > $ grep -n '' left.tsv | sed 's/:/\t/'> left.forjoin > $ grep -n '' right.tsv | sed 's/:/\t/'> right.forjoin > $ join -t " " left.forjoin right.forjoin > result.tsv > (note that for join's -t parameter somehow I need to manage to get a tab > between the quotes) > > Yes I achieved what I want, but that looks complex. Is there a simpler > way? Thanks in advance. > > I know OpenOffice 3.0 can handle up to 1024 data columns. It's difficult > to convince anyone to switch to OOO because here in China MS Office > costs only 0$. I also could use OOO3.0 for doing the join but I wish to > know the commandline way:) > Got perl?
#!/usr/bin/perl if($#ARGV < 1) { print "Arguments: <file1> <file2>\n"; exit(1); } open(FIRSTFILE, $ARGV[0]); open(SECONDFILE, $ARGV[1]); @first = <FIRSTFILE>; @second = <SECONDFILE>; $i = 0; for($i = 0;$i < 2; $i++) { $tmp1 = $first[$i]; $tmp1 =~ s/\n//g; $tmp2 = $second[$i]; $tmp2 =~ s/\n//g; $str = $tmp1 . "\t" . $tmp2 . "\n"; print $str; } close(FIRSTFILE); close(SECONDFILE); This is likely not the best or fastest way to do it, and I don't have a dataset as large as yours readily available for testing, but it seems to work. -Tim -- gentoo-user@lists.gentoo.org mailing list