Dear Guy, Have a look at cast() from the reshape package. You'll need something like
cast(fldsampleid ~ Analysis, value = "Result", data = your.data.frame) Best regards, Thierry ------------------------------------------------------------------------ ---- ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek team Biometrie & Kwaliteitszorg Gaverstraat 4 9500 Geraardsbergen Belgium Research Institute for Nature and Forest team Biometrics & Quality Assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey > -----Oorspronkelijk bericht----- > Van: r-help-boun...@r-project.org > [mailto:r-help-boun...@r-project.org] Namens Guy Jett > Verzonden: maandag 10 januari 2011 22:00 > Aan: r-help@r-project.org > Onderwerp: [R] Help with Data Transformation > > Greetings, > I am new to R and am having trouble with parsing a file with > the following characteristics: > > * Individual results for a single sample are written > to multiple lines. > > * First 16 columns are constant from sample to sample. > > * Remaining 10 need to be matched up (cross-tabbed?) > > o (the exact contents for the remaining 10 vary from sample > to sample, as indicated in the extract below) > > * Ultimate goal is to run various comparisons between > the variable columns, compare samples from separate > populations, and graph samples from the separate populations. > > * (An extract is provided below) > > The data is initially extracted from an SQL database into > Excel, then saved as a tab-delimited text file for use in R. > I have been successful in using subset() to extract specific > sample types, but have not yet been able to transform the > data so that all the data needed is on a single line. I have > looked at several R manuals, read through 'R in a Nutshell', > prowled the help resources (R Site Search and the Google > link), tried stack(), subset(), reshape(), and several other > functions, to no avail. > > Thank you very much for your help. This seems like a > wonderful community, > Guy Jett, R.G. > Project Geologist > gj...@itsi.com<mailto:gj...@itsi.com> > > Example Data Input (subset): > fldsampid CLP_ID sacode matrix > etc... prccode Lab EXMCODE > Analysis PARLABEL PARVQ Result > 2268 LHR020GW-01E2 N > WG INO BRLS NONE > E300 CL = 23590.9 > 2269 LHR020GW-01E2 N > WG INO BRLS NONE > E300 PO4 ND 50 > 2270 LHR020GW-01E2 N > WG INO BRLS NONE > E300 SO4 = 22460 > 2272 LHR020GW-01E2 N > WG MET BRLS FLDFLT > E1631 HG = 0.00171 > 2273 LHR020GW-01E2 N > WG MET BRLS FLDFLT > E1638 AG = 2.57 > 2274 LHR020GW-01E2 N > WG MET BRLS FLDFLT > E1638 AL = 122 > 2275 LHR020GW-01E2 N > WG MET BRLS FLDFLT > E1638 AS = 317 > 2276 LHR020GW-01E2 N > WG MET BRLS FLDFLT > E1638 B = 9970 > 2289 LHR020GW-01E2 N > WG MET BRLS FLDFLT > E1638 V = 131 > 2290 LHR020GW-01E2 N > WG MET BRLS FLDFLT > E1638 Zn = 1.76 > 2291 LHR020GW-01E2 N > WG MET BRLS METHOD > E1638 PB ND 0.008 > 2292 LHR020GW-01E2 N > WG MI BRLS NONE > A2320 ALK = 807000 > 2293 LHR020GW-01E2 N > WG MI BRLS NONE > A2320 ALKB = 807000 > 2294 LHR020GW-01E2 N > WG MI BRLS NONE > A2320 ALKC ND 2500 > 2295 LHR020GW-01E2 N > WG ORG BRLS NONE > A5310B DOC = 49330 > 2296 LHR020GW-01E2 N > WG SN BRLS NONE > E300 NO3 = 792 > 2326 LHR020SD-00E2 N SE > MET BRLS METHOD > E1630 MEHG = 4.28 > 2327 LHR020SD-00E2 N SE > MI BRLS METHOD > E160.3 SOLID = 48.45 > 2328 LHR020SD-00E2 N SE > ORG BRLS NONE SW9060 > TOC = 4.823 > 2329 LHR020SD-00E2 MY77J8 N SE > MET A4SW METHOD C245.5 > HG = 5100 > 2330 LHR020SD-00E2 MY77J8 N SE > MET A4SW METHOD E200.8 > AG ND 1050 > 2331 LHR020SD-00E2 MY77J8 N SE > MET A4SW METHOD E200.8 > AS = 5500 > 2332 LHR020SD-00E2 MY77J8 N SE > MET A4SW METHOD E200.8 B > = 11400 > 2346 LHR020SD-00E2 MY77J8 N SE > MET A4SW SW3050B SW6010B > V = 56900 > 2349 LHR020SD-00E2 MY77J8 N SE > MI A4SW METHOD A2540G > SOLID = 47.7 > > Desired output: > fldsampid CLP_ID sacode matrix > etc... CL PO4 SO4 AG AL > AS B V Zn > etc... ALK ALKB ALKC SOLID > DOC TOC NO3 > LHR020GW-01E2 N > WG <value> > <value> <value> <value> > <value> <value> <value> > <value> <value> > <value> <value> <value> > <value> <value> <value> > <value> <value> > LHR020SD-00E2 MY77J8 N SE > NA NA NA <value> > <value> <value> > <value> <value> NA > <value> NA NA NA > <value> NA NA NA > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.