Re: [R] Performance (speed) of ggplot

2014-09-26 Thread Jeff Newmiller
You are asked in the Posting Guide to provide a small, reproducible example. 
Your function is Byzantine, and depends on who knows what, and I can't even see 
what your end product is intended to be.

I will say that I think you have missed the point with your approach to using 
ggplot... you might well do better with base graphics if coding each display 
element is necessary for your application.

ggplot is intended for a data-driven approach, where you set up data frames 
that contain the bulk of your graphic information, and then you should rarely 
need more than one call for each type of graphic element in a given plot.
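
For illustration (not from the original posting): with the limits from the usage example below (LSL = 100, target = 300, USL = 400) collected into a data frame, a single geom_vline() call draws all three reference lines instead of one call per line.

library(ggplot2)
# hypothetical data-driven setup: reference lines held in a data frame
limits <- data.frame(value = c(100, 300, 400), what = c("LSL", "target", "USL"))
ggplot(data.frame(x = chickwts$weight), aes(x)) +
  geom_histogram(binwidth = 25) +
  geom_vline(data = limits, aes(xintercept = value, colour = what))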

That said, ggplot can be noticeably slow sometimes, so I cannot predict whether 
you will achieve your stated speed goal by reconfiguring the code at this point.
---
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:jdnew...@dcn.davis.ca.us          Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#..  Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

On September 25, 2014 11:30:08 AM PDT, Christopher Battles 
cbatt...@startllc.com wrote:
Hello list,

I have been working on learning ggplot for its extraordinary
flexibility compared to base plotting and have been developing a
function to create a Minitab-like process capability chart.  

*sigh* some of the people I interface with can only understand the data
when it is presented in Minitab format

The function creates a ggplot container to hold 10 ggplot items which
are the main process capability chart, a Q-Q plot, and the text boxes
with all the capabilities data.  When I run the function, the elapsed
time is on the order of 3 seconds, the gross majority of which is user
time.  sys time is very small.  A bit of hacking shows that the calls
to 

gt1 <- ggplot_gtable(ggplot_build(p)),

etc., each take on the order of 1/3 of a second. These times are on a
3.2GHz Xeon workstation.  I'd like to see the entire function complete
in less than a second.  My questions are: 1) Am I misusing ggplot,
hence the performance hit? 2) Is there any way to increase the speed of
this portion of the code? 3) Am I simply asking ggplot to crunch so
much that it is inevitable that it will take a while to process?

To that end, the function, vectis.cap(), can be downloaded from
http://pastebin.com/05s5RKYw .  It runs to 962 lines of code, so I
won't paste it here.  The offending ggplot_gtable calls are at lines
909 - 918.

Usage:
vectis.cap(chickwts$weight, target = 300, USL = 400, LSL = 100)

Thank you,

Christopher Battles

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error in rownames

2014-09-26 Thread PIKAL Petr
Hi.

Partly. You got a response from David regarding HTML formatting. I polished the
scrambled text but I got

 AsciiGridImpute(type.rf, xfile, outfile)
Error in file(xfiles[[i]], open = "rt") : cannot open the connection
In addition: Warning message:
In file(xfiles[[i]], open = "rt") :
  cannot open file 'sinaspect.asc': No such file or directory


So still something is missing.

Regards
Petr


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Chris Jackson-Jordan
 Sent: Thursday, September 25, 2014 5:57 PM
 To: Adams, Jean
 Cc: R help
 Subject: Re: [R] error in rownames

 Sorry to respond again so quickly, but is this what you meant?


 y <- subset(training, select = c("ResponseSu"))
 x <- subset(training, select = c("sinaspect", "habitat", "elevation",
     "disttowat", "disttoroad", "slope", "cosaspect"))
 type.rf <- yai(x = x, y = y, method = "randomForest", rfMode = "regression", ntree = 20)
 outfile <- list(Type = "RespSurf_Reg.asc")
 xfile <- list(sinaspect = "sinaspect.asc", habitat = "habitat.asc",
     elevation = "elevation.asc", disttowat = "disttowat.asc",
     disttoroad = "disttoroad.asc", slope = "slope.asc", cosaspect = "cosaspect.asc")
 dput(training[1:60, ])
 structure(list(CID = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L,
 0L, 0L), cosaspect = c(-0.546873, 0.57, 0.796074, -0.906097,
 0.546751, -0.903426, 0.55381, 0.761338, 0.858955, -0.784384,
 0.790582, 0.0429794, 0.89785, 0.998368, 0.991524, -0.769041,
 -0.104549, -0.565003, -0.56414, 0.432783, -0.652663, -0.477157,
 -0.642625, 0.0719997, 0.0202113, -0.88, -0.38204, -0.3565,
 -0.199093, 0.545715, -0.99322, 0.852507, -0.524166, 0.999378,
 -0.980175, -0.922242, -0.954118, -0.902187, -0.354786, 0.112082,
 -0.198151, -0.502453, -0.676458, -0.0673507, -0.999526, -0.840507,
 0.989883, -0.832413, -0.733279, -0.993005, -0.479475, -0.326237,
 -0.848538, -0.546614, -0.986488, -0.948983, -0.993527, -0.964193,
 -0.971359, -0.287455), disttoroad = c(368.767, 880.661, 563.239,
 72.1515, 227.697, 313.486, 265.001, 227.697, 219.137, 518.349,
 23.048, 10.3074, 41.2294, 103.074, 497.857, 494.968, 524.968,
 674.01, 10.3074, 133.996, 10.3074, 10.3074, 10.3074, 20.6147,
 226.058, 10.3074, 272.122, 226.762, 177.933, 204.075, 82.4589,
 61.8442, 14.5768, 20.6147, 29.1536, 92.1918, 75.0387, 113.849,
 133.996, 46.0959, 123.688, 120.645, 185.533, 133.996, 61.8442,
 103.074, 120.203, 154.61, 138.671, 150.077, 51.5368, 190.058,
 72.884, 20.6147, 115.24, 30.9221, 83.1006, 333.041, 30.9221,
 288.606), disttowat = c(981.15, 1448.98, 729.496, 509.146, 171.549,
 322.671, 566.998, 631.783, 783.359, 407.76, 1036.59, 352.264,
 306.113, 235.27, 241.289, 269.769, 342.633, 660.476, 578.04,
 1844.99, 1753.65, 408.801, 543.951, 756.661, 632.875, 540.031,
 927.09, 1120.47, 1166.24, 65.9993, 474.586, 437.911, 162.974,
 576.66, 557.932, 547.844, 793.667, 420.586, 208.708, 10.3074,
 497.857, 383.595, 371.78, 130.379, 449.406, 111.014, 162.974,
 393.574, 463.831, 360.021, 404.096, 43.7304, 140.195, 182.647,
 277.343, 84.9967, 87.4609, 633.546, 60.1017, 42.4983), elevation =
 c(2000.9,
 2198.18, 2194.78, 1863.8, 1902.25, 1867.29, 1943.94, 1912.35,
 1920.11, 1870.94, 1914.56, 1904.4, 1900.92, 1893.94, 1888.02,
 1889.71, 1890.95, 1894.32, 1896.11, 2011.99, 2025.01, 1902.45,
 1901.99, 2030.09, 2038.39, 1910.48, 2054.46, 1918.92, 1951.15,
 1941.24, 1853.78, 1863.76, 1875.9, 1863.6, 1864.91, 1864.18,
 1872.85, 1866.81, 1863.06, 1896.47, 1863.95, 1873.13, 1873.44,
 1888.06, 1874.35, 1888.06, 1911.09, 1872.43, 1854.83, 1866.98,
 1876.95, 1875.88, 1875.19, 1888.27, 1900.82, 1870.49, 1903.17,
 1871.52, 1902.28, 1862.18), habitat = c(14L, 1L, 9L, 2L, 5L,
 2L, 5L, 1L, 1L, 2L, 1L, 4L, 10L, 8L, 2L, 2L, 2L, 2L, 4L, 10L,
 5L, 4L, 4L, 3L, 5L, 4L, 3L, 3L, 7L, 6L, 2L, 2L, 2L, 13L, 2L,
 2L, 2L, 2L, 2L, 11L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
 2L, 2L, 2L, 2L, 10L, 2L, 2L, 2L, 2L, 2L), sinaspect = c(-0.837215,
 -0.942801, -0.6052, 0.42307, -0.837295, -0.428745, -0.832643,
 -0.648355, -0.512051, -0.620276, -0.612357, -0.999076, -0.440302,
 -0.0571153, 0.129924, -0.6392, -0.99452, -0.825089, 0.825679,
 -0.901498, -0.757648, 0.878818, 0.766181, -0.997405, -0.999796,
 0.895785, 0.924146, 0.934295, 0.979981, -0.837971, 0.11625, -0.522717,
 -0.851616, -0.0352723, 0.198133, -0.386614, -0.29943, -0.431345,
 -0.934948, -0.993699, -0.980172, -0.864605, -0.736481, 0.997729,
 -0.0307994, -0.541801, 0.141883, -0.554156, 0.679928, 0.118074,
 -0.877556, -0.945288, -0.529134, -0.837385, -0.163837, -0.315329,
 -0.113595, 0.265202, -0.237618, -0.957794), slope = c(28.8435,
 2.74077, 34.8524, 0.151366, 18.6671, 0.132943, 17.6102, 16.0996,
 9.36435, 0.452752, 1.9657, 0.996516, 1.74535, 0.148506, 0.0372119,
 0.695029, 0.455934, 0.514605, 12.6356, 16.0089, 8.21627, 1.038,
 0.498378, 3.6306, 4.11395, 10.3384, 15.5258, 16.5868, 30.8142,
 

Re: [R] Median of streaming data

2014-09-26 Thread Martin Maechler
 Rolf Turner r.tur...@auckland.ac.nz
 on Thu, 25 Sep 2014 11:44:38 +1200 writes:

 On 24/09/14 20:16, Martin Maechler wrote: SNIP

 1) has your proposal ever been provided in R?  I'd be
 happy to add it to the robustX
 (http://cran.ch.r-project.org/web/packages/robustX) or
 even robustbase
 (http://cran.ch.r-project.org/web/packages/robustbase)
 package.

 SNIP

 I have coded up the algorithm from the Cameron and Turner
 paper.  Dunno if it gives exactly the same results as my
 (Splus?) code from lo these many years ago (the code that
 is lost in the mists of time), but it *seems* to work.

excellent, thank you, Rolf!

 It is not designed to work with actual streaming data
 --- I don't know how to do that.  It takes a complete data
 vector as input.  Someone who knows about streaming data
 should be able to adapt it pretty easily.  Said he, the
 proverbial optimist.

I agree; that should not be hard. 
One way is to replace   'y[ind]' by   'getY(ind)' everywhere in the code
and let 'getY' be an argument to rlas() provided by the user.
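
For instance (an illustrative sketch only, not code from robustX): the caller would supply an accessor function, so the algorithm never needs the full vector up front.

y    <- rnorm(1000)               # stand-in for a data source
getY <- function(ind) y[ind]      # accessor: inside the algorithm, y[ind] would become getY(ind)
getY(5)                           # returns the 5th observation on demand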

 The function code and a help file are attached.  These
 files have had their names changed to end in .txt so
 that they will get through the mailing list processor
 without being stripped.  With a bit of luck.
;-)

It did work indeed.
I've added them to  'robustX' -- on R-forge,
including a plot() method and some little more flexibility.

  -- https://r-forge.r-project.org/R/?group_id=59

Thank you for all the other pointers to literature (but none to
software), some of which are quite recent.


One old idea that was not directly mentioned, I think, is the
Remedian of Rousseeuw and Bassett:

  Peter J. Rousseeuw and Gilbert W. Bassett, Jr. (1990)
  The Remedian: A Robust Averaging Method for Large Data Sets
  Journal of the American Statistical Association, Vol. 85, No. 409, pp. 97-104
  [URL: http://www.jstor.org/stable/2289530]

which is also easy to implement and I plan to add to robustbase
(as I'd want to use the C code already in robustbase) as a
reference estimator.
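
A rough sketch of the idea (illustrative only, not the planned robustbase implementation): take medians of blocks of size b, then medians of those block medians, so only small buffers are held at any time.

remedian <- function(y, b = 11) {
  while (length(y) > b) {
    pad <- ceiling(length(y) / b) * b - length(y)      # pad the last block
    y   <- apply(matrix(c(y, rep(NA, pad)), nrow = b),
                 2, median, na.rm = TRUE)              # median of each block
  }
  median(y)
}
remedian(rnorm(1e5))   # close to 0 for standard normal data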

Personally, I think there is quite some room for research and
implementation, not least because the literature seems to
always be a bit incomplete {one school not knowing about, or
at least not citing, works of the other school, etc...}

Martin

--
Martin Maechler, ETH Zurich


 If they *don't* get through, anyone who is interested
 should contact me and I will send them to you privately.

 cheers,
 Rolf

 -- 
 Rolf Turner Technical Editor ANZJS 
 
 [external: rlas.R, plain text]
 [external: rlas.Rd, plain text]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Performance (speed) of ggplot

2014-09-26 Thread PIKAL Petr
Hi

I will second Jeff. If you want several plots on one page it could be more 
convenient to use standard plot.

see
?layout or ?split.screen
for complex figures

or ?par (the 'mfrow' argument)

for simple layout.

In all cases it is difficult to combine base and grid graphics though.
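
For example, a minimal base-graphics sketch (the choice of panels is illustrative only, using the chickwts data from the original post):

op <- par(mfrow = c(2, 2))             # 2 x 2 grid of base plots
hist(chickwts$weight)
qqnorm(chickwts$weight); qqline(chickwts$weight)
boxplot(weight ~ feed, data = chickwts)
plot(density(chickwts$weight))
par(op)                                # restore previous settings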

Regards
Petr

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Jeff Newmiller
 Sent: Friday, September 26, 2014 8:02 AM
 To: Christopher Battles; r-help@r-project.org
 Subject: Re: [R] Performance (speed) of ggplot

 You are asked in the Posting Guide to provide a small, reproducible
 example. Your function is Byzantine, and depends on who knows what, and
 I can't even see what your end product is intended to be.

 I will say that I think you have missed the point with your approach to
 using ggplot... you might well do better with base graphics if coding
 each display element is necessary for your application.

 ggplot is intended for a data-driven approach, where you set up data
 frames that contain the bulk of your graphic information, and then you
 should rarely need more than one call for each type of graphic element
 in a given plot.

 That said, ggplot can be noticeably slow sometimes, so I cannot predict
 whether you will achieve your stated speed goal by reconfiguring the
 code at this point.
 ---
 
 Jeff NewmillerThe .   .  Go
 Live...
 DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live
 Go...
   Live:   OO#.. Dead: OO#..
 Playing
 Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
 /Software/Embedded Controllers)   .OO#.   .OO#.
 rocks...1k
 ---
 
 Sent from my phone. Please excuse my brevity.

 On September 25, 2014 11:30:08 AM PDT, Christopher Battles
 cbatt...@startllc.com wrote:
 Hello list,
 
 I have been working on learning ggplot for its extraordinary
 flexibility compared to base plotting and have been developing a
 function to create a Minitab-like process capability chart.
 
 *sigh* some of the people I interface with can only understand the
 data
 when it is presented in Minitab format
 
 The function creates a ggplot container to hold 10 ggplot items which
 are the main process capability chart, a Q-Q plot, and the text boxes
 with all the capabilities data.  When I run the function, the elapsed
 time is on the order of 3 seconds, the gross majority of which is user
 time.  sys time is very small.  A bit of hacking shows that the calls
 to
 
  gt1 <- ggplot_gtable(ggplot_build(p)),
 
 etc., each take on the order of 1/3 of a second. These times are on a
 3.2GHz Xeon workstation.  I'd like to see the entire function complete
 in less than a second.  My questions are: 1) Am I misusing ggplot,
 hence the performance hit? 2) Is there any way to increase the speed
 of
 this portion of the code? 3) Am I simply asking ggplot to crunch so
 much that it is inevitable that it will take a while to process?
 
 To that end, the function, vectis.cap(), can be downloaded from
 http://pastebin.com/05s5RKYw .  It runs to 962 lines of code, so I
 won't paste it here.  The offending ggplot_gtable calls are at lines
 909 - 918.
 
 Usage:
 vectis.cap(chickwts$weight, target = 300, USL = 400, LSL = 100)
 
 Thank you,
 
 Christopher Battles
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] Performance (speed) of ggplot

2014-09-26 Thread ONKELINX, Thierry
You are using ggplot2 very inefficiently. Many of your geoms plot only one data
point. You can combine several of them into a single geom. Have a look at the
gridExtra package, which has some useful functions like grid.arrange and tableGrob.
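
For illustration (hypothetical values, not taken from vectis.cap): one geom_text() call can draw several labels at once, and gridExtra can lay out a plot next to a table of the statistics.

library(ggplot2)
library(gridExtra)
stats <- data.frame(x = 1, y = 3:1,
                    lab = c("Cp = 1.2", "Cpk = 1.1", "Ppk = 1.0"))   # hypothetical values
p <- ggplot(stats, aes(x, y)) + geom_text(aes(label = lab))          # one geom, many labels
grid.arrange(p, tableGrob(stats), ncol = 2)                          # plot and table side by side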

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens 
Christopher Battles
Verzonden: donderdag 25 september 2014 20:30
Aan: r-help@r-project.org
Onderwerp: [R] Performance (speed) of ggplot

Hello list,

I have been working on learning ggplot for its extraordinary flexibility 
compared to base plotting and have been developing a function to create a 
Minitab-like process capability chart.

*sigh* some of the people I interface with can only understand the data when it 
is presented in Minitab format

The function creates a ggplot container to hold 10 ggplot items which are the 
main process capability chart, a Q-Q plot, and the text boxes with all the 
capabilities data.  When I run the function, the elapsed time is on the order 
of 3 seconds, the gross majority of which is user time.  sys time is very 
small.  A bit of hacking shows that the calls to

gt1 <- ggplot_gtable(ggplot_build(p)),

etc., each take on the order of 1/3 of a second. These times are on a 3.2GHz 
Xeon workstation.  I'd like to see the entire function complete in less than a 
second.  My questions are: 1) Am I misusing ggplot, hence the performance hit? 
2) Is there any way to increase the speed of this portion of the code? 3) Am I 
simply asking ggplot to crunch so much that it is inevitable that it will take 
a while to process?

To that end, the function, vectis.cap(), can be downloaded from 
http://pastebin.com/05s5RKYw .  It runs to 962 lines of code, so I won't paste 
it here.  The offending ggplot_gtable calls are at lines 909 - 918.

Usage:
vectis.cap(chickwts$weight, target = 300, USL = 400, LSL = 100)

Thank you,

Christopher Battles

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Can't seem to get lapply to work.

2014-09-26 Thread Keith S Weintraub
Jeff,
Of course a data.frame is a list of columns. Duh!

Sorry about that. Was working about 10 hours straight.

Also sorry about the excess attributes. They were created via expand.grid.

Here is my current solution which seems to be working:

lapply(1:nrow(tstSet), function(x, y, z) oneRun(y[x,],z), tstSet, inp)
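
An equivalent, slightly simpler formulation (a sketch; oneRun, tstSet and inp as defined in this thread) lets the anonymous function take just the row index:

res <- lapply(seq_len(nrow(tstSet)), function(i) oneRun(tstSet[i, ], inp))   # one element per row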

Thanks so much for your timely help,
Best,
KW


--

On Sep 26, 2014, at 12:23 AM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote:

 One suggestion is to tell us what you want to accomplish, because that error 
 makes perfect sense to me and your intent is rather opaque. Data frames are 
 not lists of rows, they are lists of columns.
 
 One solution could be
 
  tstSet[["term"]] + 1
 
 Another might be
 
  tstSet <- within(tstSet, {term <- term + 1})
 
 ---
 Jeff NewmillerThe .   .  Go Live...
 DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
 Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
 /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
 --- 
 Sent from my phone. Please excuse my brevity.
 
 On September 25, 2014 8:25:44 PM PDT, Keith S Weintraub kw1...@gmail.com 
 wrote:
 Folks,
 
 I have the following problem.
 
 tstSet <- structure(list(corr = c(0.59, 0.62), term = c(7, 7), am = c("AmYes",
 "AmNo"), prem = c(19.5, 14.75)), .Names = c("corr", "term", "am",
 "prem"), out.attrs = structure(list(dim = structure(c(3L, 2L,
 2L, 41L), .Names = c("corr", "term", "am", "prem")), dimnames = structure(list(
     corr = c("corr=0.59", "corr=0.62", "corr=0.65"), term = c("term=5",
     "term=7"), am = c("am=AmYes", "am=AmNo"), prem = c("prem=10.00",
     "prem=10.25", "prem=10.50", "prem=10.75", "prem=11.00", "prem=11.25",
     "prem=11.50", "prem=11.75", "prem=12.00", "prem=12.25", "prem=12.50",
     "prem=12.75", "prem=13.00", "prem=13.25", "prem=13.50", "prem=13.75",
     "prem=14.00", "prem=14.25", "prem=14.50", "prem=14.75", "prem=15.00",
     "prem=15.25", "prem=15.50", "prem=15.75", "prem=16.00", "prem=16.25",
     "prem=16.50", "prem=16.75", "prem=17.00", "prem=17.25", "prem=17.50",
     "prem=17.75", "prem=18.00", "prem=18.25", "prem=18.50", "prem=18.75",
     "prem=19.00", "prem=19.25", "prem=19.50", "prem=19.75", "prem=20.00"
     )), .Names = c("corr", "term", "am", "prem"))), .Names = c("dim",
 "dimnames")), row.names = c(460L, 239L), class = "data.frame")
 
 tstSet
     corr term    am  prem
 460 0.59    7 AmYes 19.50
 239 0.62    7  AmNo 14.75
 
 A function:
 
 twoRun <- function(aRow, inp) {
   aRow[["term"]] + inp
 }
 
 When I try the following:
 lapply(tstSet, twoRun, 1)
 
 I get the following error:
 Error in aRow[["term"]] : subscript out of bounds
 
 Any suggestions?
 
 Thanks much for your time,
 KW
 
 
 
 
 --
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with conversion of variable labels (Hmisc and Epicalc)

2014-09-26 Thread Otto Pichlhöfer
For my epidemiological analysis I use the packages Epicalc and Hmisc – 
among others. Both packages allow you to assign variable labels that will 
appear in the output of the respective packages’ own functions. The ways 
the labels are stored in a data frame are very different. I am 
wondering if there is a function that would allow me to easily convert 
Epicalc labels to Hmisc labels, and possibly the other way around. 

Unfortunately I am not adept enough in R to write such a function myself.

--
Otto Pichlhöfer

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Savitzky-Golay Smoother

2014-09-26 Thread Erick Okuto
Dear Paul and Henrik,
I have a time series with some missing data points that I need to smooth
using a Savitzky-Golay filter. A related question was asked here
http://thr3ads.net/r-help/2012/11/2121748-Savitzky-Golay-filtering-with-missing-data
but no straightforward answer was posted. However, Henrik (cc'd here) asked
a related question on smoothing for reflectance here
http://thr3ads.net/r-help/2004/02/835137-Savitzky-Golay-smoothing-for-reflectance-data
which I have also been unable to follow up on. I will be glad if you could
assist me with some insights on the way forward or point me to a relevant
source of help.

I have attached simple test data plus the R code, which only works with
non-missing data. I will appreciate any possible assistance.

Thanks,
Erick.
-- 
*Erick Okuto, Ph.D.*
* Candidate*
*School of Mathematics & Actuarial Science*
*Jaramogi Oginga Odinga University of Science & Technology (JOOUST)/ World
Agroforestry Centre (ICRAF), **Climate Change Unit, Nairobi-Kenya. *
Voice: +254207224154   Mobile: +254725005276   Skype id: erickokuto
Email: erickok...@gmail.com
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [rJava] RJavaClassLoader and system classloader

2014-09-26 Thread Benoît Thiébault
Hi everyone,

I want to call a Java application from R and have encountered some problems 
with the way rJava deals with the system class loader.

To run my application, I use the following R script:

 library(rJava)
 .jinit()
 .jaddClassPath("myApp.jar")
 rWrapper <- .jnew("org/test/RWrapper")
 .jcall(rWrapper, "V", "start")

My Java application has a plugin loading mechanism that uses a specific 
PluginClassLoader to load plugins stored in additional JAR files (e.g. 
plugin.jar). This PluginClassLoader is programmed so that it knows about and loads 
the plugin JARs. Like any Java classloader, it is a child of the system class loader.

Finally, plugins not only use classes stored in their plugin.jar file, they 
also depend on classes contained in the main myApp.jar file (the Plugin 
interface, for example, is defined in myApp.jar).

In a pure Java environment, myApp.jar is known from the system class loader (it 
is in the classpath) and thus the PluginClassLoader can load classes from the 
plugin (it knows both about the plugin.jar and the myApp.jar files)

In the R context however, using the above script, it's the RJavaClassLoader that 
knows about the myApp.jar file. The org.test.RWrapper class is instantiated by 
the RJavaClassLoader. The system class loader however does not know about 
myApp.jar anymore. Neither does the PluginClassLoader. So when the 
PluginClassLoader loads a plugin class, it can only load classes that are in 
the plugin.jar file, and as soon as a class from the myApp.jar file is required 
by the plugin, it crashes with a java.lang.NoClassDefFoundError message.

My question is: how can I force rJava to load the classpath in the system 
classloader and not only in the RJavaClassLoader?

I cannot make the PluginClassLoader know the RJavaClassLoader as my application 
also has to run in a non-R environment.

Thanks for your time,

Kind regards,

Ben
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with conversion of variable labels (Hmisc and Epicalc)

2014-09-26 Thread David Winsemius

On Sep 26, 2014, at 3:21 AM, Otto Pichlhöfer wrote:

 For my epidemiological analysis I use the packages Epicalc and Hmisc – 
 among others. Both packages allow to assign variable labels that will 
 appear in the output of the respective packages’ own functions. The modes 
 of storage of the labels in a dataframe are very different. I am 
 wonderiung if there is a function that would allow to easily convert 
 Epicalc labels to Hmisc labels and possibly the other way around. 
 
 Unfortunately I am not adept enough in R to write such a function myself.

You need to use the attributes() function to display the names and structures of 
the labels in the two packages. First look at help(pack=Hmisc) and 
help(pack=Epicalc) to see how the package-specific functions are documented.

-- 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] package loadable in R 3.1.1 Rterm but not in emacs/ESS

2014-09-26 Thread Christopher W Ryan
I'm running R on Windows 7. Clean install on a brand new computer
yesterday.  I installed Protext then R then Vincent Goulet's emacs
with ESS, in that order.

I then installed some R packages, in the R terminal window.  Among them was car

Today I opened emacs, hit M-x R to start an R session, and tried to
load the car library. Results looked like the following:


R version 3.1.1 (2014-07-10) -- Sock it to Me
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: i386-w64-mingw32/i386 (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

  options(chmhelp=FALSE, help_type="text")
 options(STERM='iESS', str.dendrogram.last="'", editor='emacsclient.exe',
 show.error.locations=TRUE)

 library(car)
Error in library(car) : there is no package called 'car'




However, in an R terminal (that is, outside of emacs) the car package
loads fine:

R version 3.1.1 (2014-07-10) -- Sock it to Me
Copyright (C) 2014 The R Foundation for Statistical Computing
Platform: i386-w64-mingw32/i386 (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

 library(car)


I have not encountered this before and am confused. Why would R 3.1.1
in its terminal see a library, whereas it would not in emacs?  (car
is just an example; the same thing happens with zoo, stringr, Hmisc,
and others.)

Thanks.

--Chris Ryan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] question regarding pcalg package

2014-09-26 Thread Jack Luo
Hi,

I am trying to use the pcalg package to do some causal inference on some
high dimensional omics data (~ 30k variables). It takes forever to run and
my machine gets stuck. I just realized that the conditional independence tests
can be calculated without using the correlation matrix (I think it's just
looping through all the partial correlation calculations). I am wondering,
after getting all the edges (maybe in the format of a two-column matrix with
sources and targets), how to use pc, fci, rfci, etc. to do
the causal inference. Right now it seems that we need to supply a
correlation matrix to those algorithms in the sufficient statistics
argument.

Thanks,

-Jack

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Savitzky-Golay Smoother

2014-09-26 Thread Paul Kienzle
You can perform the equivalent filtering by hand using something like:

k = (window_length - 1) / 2
for i in k to len(x) - k
    xh, yh = x[i-k:i+k], y[i-k:i+k]
    idx = isfinite(xh) and isfinite(yh)
    xh, yh = xh[idx], yh[idx]
    p = polyfit(xh, yh, degree=filter_order)
    filtered_y[i] = polyval(p, x[i])

Points at the beginning use xh,yh from 1 to window_length, and at the end
from n-window_length to n.
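
A rough R rendering of the pseudocode above (an untested sketch added for illustration; the name sg_manual and its defaults are made up, and x is assumed sorted with an odd window_length):

sg_manual <- function(x, y, window_length = 5, filter_order = 2) {
  k   <- (window_length - 1) / 2
  n   <- length(x)
  out <- rep(NA_real_, n)
  for (i in seq_len(n)) {
    lo <- max(1, i - k); hi <- min(n, i + k)        # truncate the window at the ends
    d  <- data.frame(xh = x[lo:hi], yh = y[lo:hi])
    d  <- d[is.finite(d$xh) & is.finite(d$yh), ]    # drop missing points
    if (length(unique(d$xh)) > filter_order) {      # enough points to fit the polynomial
      fit    <- lm(yh ~ poly(xh, filter_order, raw = TRUE), data = d)
      out[i] <- predict(fit, newdata = data.frame(xh = x[i]))
    }
  }
  out
}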

Savitzky-Golay uses the fact that the x points are equally spaced to
precompute the weights on an FIR filter, so that

filtered_y[i] = yh dot filter_coefficients

The FIR filter coefficients for order p on a window of 2k+1 points are determined
from the top row of the pseudo-inverse of the overdetermined polynomial design
matrix C, whose elements Cij are (-k+i)^(p-j) for i in 0 to 2k, j in 0 to p.  For
an order 2 solver with 5 points, this would be:

4 -2 1
1 -1  1
0  0  1
1  1  1
4  2  1

Computing the pseudo-inverse, this gives filter coefficients:

[ 0.14285714, -0.07142857, -0.14285714, -0.07142857,  0.14285714]

If the central point were NA, the coefficients would have to be:

[ 0.1667, -0.1667, NA, -0.1667,  0.1667]

Sliding the window forward one, the coefficients become:

[ 0.11363636, NA, -0.18181818, -0.09090909,  0.15909091]

Clearly, there is not going to be a trivial operation on the data stream or
filter coefficients which will handle the missing data.

You have a few options.

(1) if your data set is small, just do the filter manually.
(2) if your data set is large but the missing data is sparse, replace
missing data by infinity, then run the filter as normal.  For each
nonfinite number in the result, compute the filtered value manually.
(3) if you are not too concerned about full precision (this is, after all,
a smoothing operation), you could try replacing each missing value with the
average of the surrounding values, and hopefully this will not disturb the
polynomial approximations for the surrounding points too much.

- Paul


On Fri, Sep 26, 2014 at 3:32 AM, Erick Okuto erickok...@gmail.com wrote:

 Dear Paul and Henrik,
 I have a time series with some missing data points that i need smoothed
 using Savitzky-Golay filter. Related question was asked  here
 http://thr3ads.net/r-help/2012/11/2121748-Savitzky-Golay-filtering-with-missing-data
 but no straight forward answer was posted. However, Henrik (cc'd here) did
 ask related question on smoothing for reflectance here
 http://thr3ads.net/r-help/2004/02/835137-Savitzky-Golay-smoothing-for-reflectance-data
 which i have as well been unable to follow up. I will be glad if you could
 assist me with some insights on the way forward or point to a relevant
 source of help.

 I have attached a simple test data plus the R code which only work with
 non missing data. I will appreciate any possible assistance.

 Thanks,
 Erick.
 --
 *Erick Okuto, Ph.D.*
 * Candidate*
 *School of Mathematics  Actuarial Science*
 *Jaramogi Oginga Odinga University of Science  Technology (JOOUST)/ World
 Agroforestry Centre (ICRAF), **Climate Change Unit, Nairobi-Kenya. *
 Voice:  +254207224154​Mobile: +254725005276Skype id:  erickokuto
 Email: erickok...@gmail.com






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Histogram from a single column of a data frame

2014-09-26 Thread Richard Lerner
Column 7 of oded is Breed. If I enter
 summary(Breed)
I get the counts of the numbers in each breed.
However, if I enter


I have tried

 hist($Breed)
Error: unexpected '$' in hist($

 hist(Breed)
Error in hist(Breed) : object 'Breed' not found

 hist(Breed)
Error in hist.default(Breed) : 'x' must be numeric

FYI

 colnames(oded)
[1] Subject.Name   Date   Species
AgeSex
[6] SpayedNeutered Breed  Breed.Code

How do I get a histogram of the counts in column 7?


Thanks
Richard

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with conversion of variable labels (Hmisc and Epicalc)

2014-09-26 Thread David Winsemius


On Sep 26, 2014, at 4:21 AM, Otto Pichlhöfer wrote:


For my epidemiological analysis I use the packages Epicalc and Hmisc –
among others. Both packages allow to assign variable labels that will
appear in the output of the respective packages’ own functions. The  
modes

of storage of the labels in a dataframe are very different. I am
wonderiung if there is a function that would allow to easily convert
Epicalc labels to Hmisc labels and possibly the other way around.

Unfortunately I am not adept enough in R to write such a function  
myself.


Here are some further explorations on the topic. You should be able to  
see that epicalc (not Epicalc) stores labels as a dataframe-level  
attribute, while rms/Hmisc creates a column-level attribute. So you  
could actually have both systems working in the same dataframe. (Which  
could get pretty confusing if your short-term memory is as limited as mine.)


require(epicalc)
require(rms) # which loads Hmisc
# This is the first section of the examples in
 sbp <- c(120, 100, 110, 120, 140, 120,  NA,  NA)
 dbp <- c( 80,  80,  70,  80,  70,  NA,  70,  60)
 .data <- data.frame(sbp, dbp)
 use(.data)

I'm guessing that this `use` function is sort of like R's attach  
function. I strongly recommend against using `attach`, and I suspect  
also against using `use`.


 pack()   # I have no idea what that does.
 des()   # or that

 No. of observations =  8
  Variable  Class   Description
1 sbp   numeric
2 dbp   numeric
 label.var(sbp, "systolic BP")
 str(.data$sbp)
 num [1:8] 120 100 110 120 140 120 NA NA
 str(.data)
'data.frame':   8 obs. of  2 variables:
 $ sbp: num  120 100 110 120 140 120 NA NA
 $ dbp: num  80 80 70 80 70 NA 70 60
 - attr(*, "var.labels")= chr " systolic BP "

That shows that the attributes are assigned to the dataframe by  
epicalc's `var.labels`


 ?label
 label(.data$sbp) <- "test.sbp"
 str(.data)
'data.frame':   8 obs. of  2 variables:
 $ sbp:Classes 'labelled', 'numeric'  atomic [1:8] 120 100 110 120  
140 120 NA NA

  .. ..- attr(*, "label")= chr "test.sbp"
 $ dbp: num  80 80 70 80 70 NA 70 60
 - attr(*, "var.labels")= chr " systolic BP "

And that shows that `Hmisc::label` does its assignment to the  
individual column vector.


If you wanted to assign Hmisc-type labels to the columns of a  
dataframe that has some or all of its columns labelled in the epicalc  
method, then this loop succeeds:


 for (coln in seq_along(.data)) { label(.data[[coln]]) <-
     attr(.data, "var.labels")[coln] }

 str(.data)
'data.frame':   8 obs. of  2 variables:
 $ sbp:Classes 'labelled', 'numeric'  atomic [1:8] 120 100 110 120  
140 120 NA NA

  .. ..- attr(*, "label")= chr "systolic BP"
 $ dbp:Classes 'labelled', 'numeric'  atomic [1:8] 80 80 70 80 70 NA  
70 60

  .. ..- attr(*, "label")= chr ""
 - attr(*, "var.labels")= chr " systolic BP "
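
Going the other way -- copying Hmisc-style column labels back into an epicalc-style "var.labels" attribute -- might be sketched like this (untested, using the same .data as above):

 attr(.data, "var.labels") <- vapply(.data, function(col) {
     lb <- attr(col, "label")
     if (is.null(lb)) "" else lb
 }, character(1))
 str(.data)   # the dataframe-level attribute now mirrors the column labels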

--

David Winsemius, MD
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Histogram from a single column of a data frame

2014-09-26 Thread William Dunlap
Try
  hist(oded$Breed)

(I suspect that summary(Breed) does not work in your current session either -
perhaps you had a dataset named just Breed or had attached the data.frame
oded in the session where summary(Breed) works.)
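
If Breed is a factor or character column -- which the earlier "'x' must be numeric" error suggests -- then a bar chart of the counts may be closer to what is wanted (a sketch, assuming the same oded data frame):

  barplot(table(oded$Breed))   # counts per breed
  plot(oded$Breed)             # equivalent shortcut if Breed is a factor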

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Fri, Sep 26, 2014 at 10:57 AM, Richard Lerner richa...@uchicago.edu wrote:
 Column 7 of oded is Breed. If I enter
 summary(Breed)
 I get the counts of the numbers in each breed.
 However, if I enter


 I have tried

 hist($Breed)
 Error: unexpected '$' in hist($

 hist(Breed)
 Error in hist(Breed) : object 'Breed' not found

 hist(Breed)
 Error in hist.default(Breed) : 'x' must be numeric

 FYI

 colnames(oded)
 [1] Subject.Name   Date   Species
 AgeSex
 [6] SpayedNeutered Breed  Breed.Code

 How do I get a histogram of the counts in column 7?


 Thanks
 Richard

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Histogram from a single column of a data frame

2014-09-26 Thread Rolf Turner

On 27/09/14 05:57, Richard Lerner wrote:

Column 7 of oded is Breed. If I enter

summary(Breed)

I get the counts of the numbers in each breed.
However, if I enter


I have tried


hist($Breed)

Error: unexpected '$' in hist($


hist(Breed)

Error in hist(Breed) : object 'Breed' not found


hist(Breed)

Error in hist.default(Breed) : 'x' must be numeric

FYI


colnames(oded)

[1] Subject.Name   Date   Species
AgeSex
[6] SpayedNeutered Breed  Breed.Code

How do I get a histogram of the counts in column 7?



Are you for real, or is this some kind of joke?

Assuming it's not a joke:

(1) Look at ?"$".

(2) WTF do you expect hist($Breed) to do?  Where is Breed going to be 
found, for pity's sake?  You appear to have no understanding of how R 
(or any programming language) works.


Read the Introduction to R manual and *get* some understanding if you 
are going to use R.


(3) The correct usage obviously (and I mean ***really***; it *IS* 
obvious) is hist(oded$Breed) --- which tells hist() where to find Breed.


(4) You could also use

with(oded,hist(Breed))

which is a useful syntax in some circumstances.

cheers,

Rolf Turner

--
Rolf Turner
Technical Editor ANZJS

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Median of streaming data

2014-09-26 Thread Rolf Turner

On 26/09/14 21:48, Martin Maechler wrote:

Rolf Turner r.tur...@auckland.ac.nz


SNIP



  I have coded up the algorithm from the Cameron and Turner
  paper.  Dunno if it gives exactly the same results as my
  (Splus?) code from lo these many years ago (the code that
  is lost in the mists of time), but it *seems* to work.

excellent, thank you, Rolf!

  It is not designed to work with actual streaming data
  --- I don't know how to do that.  It takes a complete data
  vector as input.  Someone who knows about streaming data
  should be able to adapt it pretty easily.  Said he, the
  proverbial optimist.

I agree; that should not be hard.
One way is to replace   'y[ind]' by   'getY(ind)' everywhere in the code
and let 'getY' be an argument to rlas() provided by the user.

  The function code and a help file are attached.  These
  files have had their names changed to end in .txt so
  that they will get through the mailing list processor
  without being stripped.  With a bit of luck.
;-)

It did work indeed.
I've added them to  'robustX' -- on R-forge,
including a plot() method and some little more flexibility.

   -- https://r-forge.r-project.org/R/?group_id=59




SNIP

Since I posted my previous email I have found some typos in the 
documentation and made some adjustments to the code.


I also realized that the name rlas sounds a bit too much like a random 
number generator, according to R's conventions.  So I have decided that 
lasr would be a better name.


I hope this change of horses in midstream doesn't mess things up too 
much for you.  If it does, of course feel free to stick with rlas.


I have attached revised *.R and *.Rd files.  Not sure that the *.Rd file 
will get through as-is.  Please let me know if it doesn't.


cheers,

Rolf


--
Rolf Turner
Technical Editor ANZJS
lasr <- function(y,b=0.2,mfn=function(n){0.1*n^(-0.25)},
                 nstart=30,scon=NULL) {
# Initialize:
y0    <- y[1:nstart]
alpha <- median(y0)
s     <- if(is.null(scon)) mean(abs(y0-alpha)) else scon
mu    <- mfn(nstart)/s
eps   <- s/nstart^b
kount <- sum(abs(alpha-y0) < eps)
g     <- kount/(eps*nstart)
ny    <- length(y)
n     <- nstart+1
locn  <- numeric(ny)
locn[1:(nstart-1)] <- NA
locn[nstart] <- alpha
scale  <- numeric(ny)
scale[1:(nstart-1)] <- if(is.null(scon)) NA else scon
scale[nstart] <- s

# Calculate recursively:
while(n <= ny) {
   s     <- if(is.null(scon)) ((n-1)*s + abs(y[n]-alpha))/n else scon
   mu    <- mfn(n)/s
   eye   <- if(abs(alpha - y[n]) < s/n^b) 1 else 0
   g     <- (1-1/n)*g + n^(b-1)*eye/s
   a     <- max(mu,g)
   alpha <- alpha + sign(y[n]-alpha)/(a*n)
   locn[n]  <- alpha
   scale[n] <- s
   n <- n+1
}
list(locn=locn,scale=scale)
}
\name{lasr}
\alias{lasr}
\title{
  Recursive location and scale.
}
\description{
  Calculate an estimate of location, asymptotically
  equivalent to the median, and an estimate of scale
  equal to the \bold{MEAN} absolute deviation.  Both
  done recursively.
}
\usage{
lasr(y, b = 0.2, mfn = function(n){0.1 * n^(-0.25)},
 nstart = 30, scon = NULL)
}
\arguments{
  \item{y}{
  A numeric vector of i.i.d. data whose location and scale
  parameters are to be estimated.
}
  \item{b}{
  A tuning parameter (default value equal to that used by
  Holst, 1987).
}
  \item{mfn}{
  Function of the index of the data which must be positive
  and tend to 0 as the index tends to infinity.  The default
  function is that used by Holst, 1987.
}
  \item{nstart}{
  Starting values for the algorithm are formed from the first
  \code{nstart} values of \code{y}.  The default value is that
  used in Cameron and Turner, 1993.
}
  \item{scon}{
  A constant value for the scale parameter \code{s}. If this
  is provided then the algorithm becomes equivalent to the
  algorithm of Holst, 1987.
}
}
\value{
A list with entries
  \item{locn}{The successive recursive estimates of location.  The
  first \code{nstart - 1} of these are \code{NA}.}
  \item{scale}{The successive recursive estimates of scale. If
  \code{scon} is specified these are all equal to \code{scon}. Otherwise
  the first \code{nstart - 1} values are \code{NA}.}
}

\section{Remark}{
The \bold{mean} absolute deviation is used as an estimate of scale
(rather than the more expected \bold{median} absolute deviation) simply
because the former can be calculated recursively.
}

\references{
Cameron, Murray A. and Turner, T. Rolf (1993). Recursive location and
scale estimators. \emph{Commun. Statist. --- Theory Meth.} \bold{22}
(9) 2503--2515.

Holst, U. (1987). Recursive estimators of location.
\emph{Commun. Statist. --- Theory Meth.} \bold{16} (8) 2201--2226.
}
\author{
 \email{r.tur...@auckland.ac.nz}
 \url{http://www.stat.auckland.ac.nz/~rolf}
}
\examples{
set.seed(42)
y  <- rnorm(1)
z1 <- lasr(y)
z2 <- lasr(y,scon=1)
z3 <- lasr(y,scon=100)
OP <- par(mfrow=c(3,1))
plot(z1$locn,type="l")
abline(h=median(y),col="red")
plot(z2$locn,type="l")
abline(h=median(y),col="red")
plot(z3$locn,type="l")

Re: [R] Savitzky-Golay Smoother

2014-09-26 Thread Gabor Grothendieck
On Fri, Sep 26, 2014 at 3:32 AM, Erick Okuto erickok...@gmail.com wrote:
 Dear Paul and Henrik,
 I have a time series with some missing data points that i need smoothed
 using Savitzky-Golay filter. Related question was asked  here
 http://thr3ads.net/r-help/2012/11/2121748-Savitzky-Golay-filtering-with-missing-data
 but no straight forward answer was posted. However, Henrik (cc'd here) did
 ask related question on smoothing for reflectance here
 http://thr3ads.net/r-help/2004/02/835137-Savitzky-Golay-smoothing-for-reflectance-data
 which i have as well been unable to follow up. I will be glad if you could
 assist me with some insights on the way forward or point to a relevant
 source of help.


Not Savitzky-Golay but if z is a time series then

library(zoo)
na.spline(z)

will fill in NAs with spline curve fits.  See ?na.spline

There are other NA filling routines in zoo too:

 ls(pattern = "^na[.]", "package:zoo")
 [1] na.aggregate na.aggregate.default na.approx
 [4] na.approx.defaultna.fill  na.fill.default
 [7] na.locf  na.locf.default  na.spline
[10] na.spline.defaultna.StructTS  na.trim
[13] na.trim.default  na.trim.ts
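
For example (a small made-up zoo series, not the poster's data):

library(zoo)
z <- zoo(c(1, 2, NA, NA, 5, 6), order.by = 1:6)
na.spline(z)   # fill the NAs with spline-interpolated values
na.approx(z)   # or linear interpolation instead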



-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] quadprog::solve.QP sometimes returns NaNs

2014-09-26 Thread Benjamin Tyner
Hello,

Here is an example; hopefully it is reproducible on others' platform:

library(quadprog)

n <- 66L

set.seed(6860)
X <- matrix(1e-20, n, n)
diag(X) <- 1
Dmat <- crossprod(X)
y <- seq_len(n)
dvec <- crossprod(X, y)

Amat <- diag(n)
bvec <- y + runif(n)

sol <- solve.QP(Dmat, dvec, Amat, bvec, meq = n)

print(sol$solution) # this gives all NaNs

under sessionInfo():

R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C 
 [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8   
 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8  
 [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C   
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C  

attached base packages:
[1] stats graphics  grDevices utils datasets  methods  
base

other attached packages:
[1] quadprog_1.5-5

Any ideas?

Thanks
Ben

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.